# NVIDIA Vulkan Ray Tracing Tutorial

By Martin-Karl Lefrançois and Pascal Gautron

The focus of this document and the provided code is to showcase a basic integration of ray tracing within an existing Vulkan sample, using the `VK_NV_ray_tracing` extension. Note that for educational purposes all the code is contained in a very small set of files. A real integration would require additional levels of abstraction.

# Environment Setup

To get support for `VK_NV_ray_tracing`, please install an [NVIDIA driver]( version 416.81 or later, and the [Vulkan SDK]( version or later.

The base code for the tutorial is located here:

!!! Note: Base Source Code ([Download](/rtx/raytracing/vkrt_helpers/files/

Download the file and extract it. The solution contains the `VkExample1` project, which provides a simple framework allowing us to load OBJ files and display them using Vulkan. The project compiles and runs, loading a simple OBJ file and rendering it using the regular Vulkan rasterization.

![](/sites/default/files/pictures/2019/vulkan_raytracing/resultRasterCube.png)

# Vulkan Ray Tracing Utilities

This tutorial covers the usage of the NVIDIA extension for ray tracing, `VK_NV_ray_tracing`. In the following, we will use some utility functions that abstract away some particularly verbose implementation details. The implementation of those abstractions is fully documented [here](/rtx/raytracing/vkrt_helpers) and should help clarify the concepts of Vulkan ray tracing. Those utilities are already present in the archive, in the `libs\vulkannv` and `libs\vulkannv\nv_helpers_vk` folders.

1. Add the utility files to the solution
1. Select all `.cpp` files of both folders and set the precompiled header flag to `Not Using Precompiled Headers`
1. 
Add `$(SolutionDir)libs\vulkannv` to the project include path under `C/C++ > General > Additional Include Directories`

# Get the existing code to run

Go to the `main` function of the `main.cpp` file and find the call to `setupVulkan`. This method has a third parameter defining the extensions to activate, since Vulkan requires an explicit activation of the extensions used by the application. We therefore need to explicitly add the `VK_NV_ray_tracing` extension, as well as its dependency `VK_KHR_get_memory_requirements_2`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
// #VKRay: Activate the ray tracing extension
VkCtx.setupVulkan(window, true,
                  {VK_KHR_SWAPCHAIN_EXTENSION_NAME, VK_NV_RAY_TRACING_EXTENSION_NAME,
                   VK_KHR_GET_MEMORY_REQUIREMENTS_2_EXTENSION_NAME});
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Ray Tracing setup

In the `HelloVulkan` class, add the initialization function and a member storing the capabilities of the GPU for ray tracing:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
// #VKRay
void initRayTracing();

VkPhysicalDeviceRayTracingPropertiesNV m_raytracingProperties = {};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

At the end of the `hello_vulkan.cpp` file, add the body of the method, which will query the capabilities of the GPU with respect to the ray tracing extension. In particular, it will obtain the maximum recursion depth, i.e. the number of nested ray tracing calls that can be performed from a single ray. This can be seen as the number of times a ray can bounce in the scene in a recursive path tracer. Note that for performance purposes, recursion should in practice be kept to a minimum, favoring a loop formulation. The shader header size will be useful when creating the shader binding table in a later section.
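The query below relies on Vulkan's `sType`/`pNext` chaining: the ray tracing properties structure is hooked into the `pNext` chain of `VkPhysicalDeviceProperties2`, and the implementation walks that chain, filling every structure it recognizes. The following minimal mock illustrates how such a chain is assembled and consumed; the `BaseOutStructure` type, the `findInChain` helper and the constants are illustrative assumptions, not the real Vulkan definitions:

```cpp
#include <cassert>
#include <cstdint>

// Minimal mock of the Vulkan sType/pNext extension-chaining pattern used by
// vkGetPhysicalDeviceProperties2. NOT the real Vulkan types.
struct BaseOutStructure
{
  uint32_t          sType;  // identifies the structure type
  BaseOutStructure* pNext;  // next structure in the chain, or nullptr
};

// Walk a pNext chain and return the first structure with the requested sType,
// which is essentially what the implementation does to locate the ray tracing
// properties structure we appended to the chain
static BaseOutStructure* findInChain(BaseOutStructure* head, uint32_t sType)
{
  for(BaseOutStructure* p = head; p != nullptr; p = p->pNext)
    if(p->sType == sType)
      return p;
  return nullptr;
}
```

In the real code, `m_raytracingProperties` plays the role of the chained structure, and `vkGetPhysicalDeviceProperties2` fills it while traversing the chain.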
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
//--------------------------------------------------------------------------------------------------
// Initialize Vulkan ray tracing
// #VKRay
void HelloVulkan::initRayTracing()
{
  // Query the values of shaderGroupHandleSize and maxRecursionDepth in the current implementation
  m_raytracingProperties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_RAY_TRACING_PROPERTIES_NV;
  m_raytracingProperties.pNext = nullptr;
  m_raytracingProperties.maxRecursionDepth     = 0;
  m_raytracingProperties.shaderGroupHandleSize = 0;

  VkPhysicalDeviceProperties2 props;
  props.sType      = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
  props.pNext      = &m_raytracingProperties;
  props.properties = {};
  vkGetPhysicalDeviceProperties2(VkCtx.getPhysicalDevice(), &props);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## main

In the `main.cpp` file, in the `main` function, call the initialization method right after `helloVulkan.updateDescriptorSet();`

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
// #VKRay
helloVulkan.initRayTracing();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As an exercise, when running the program, you can put a breakpoint in the `initRayTracing` method to inspect the resulting values. On a Quadro RTX 6000, the maximum recursion depth is 31, and the shader header size is 16.

# Acceleration Structure

To be efficient, ray tracing requires putting the geometry in an acceleration structure (AS) that will reduce the number of ray-triangle intersection tests during rendering. This structure is divided into a two-level tree. Intuitively, this can directly map to the notion of an object in a scene graph, where the internal nodes of the graph have been collapsed into a single transform matrix for each bottom-level acceleration structure (BLAS) object.
Those BLAS then hold the actual vertex data of each object. However, it is also possible to combine multiple objects within a single bottom-level AS: for that, a single BLAS can be built from multiple vertex buffers, each with its own transform matrix. Note that if an object is instantiated several times within the same BLAS, its geometry will be duplicated. This can be particularly useful to improve performance on static, non-instantiated scene components (as a rule of thumb, the fewer BLAS, the better).

The top-level AS then contains the object instances, each one with its own transformation matrix and a reference to the corresponding BLAS. We will start with a single bottom-level AS containing the vertices of the loaded object and a top-level AS instancing it once with an identity transform.

![Figure [step]: Acceleration Structure](/sites/default/files/pictures/2019/vulkan_raytracing/AccelerationStructure.svg)

In the header file, add the includes to the helper classes for the bottom-level and top-level acceleration structures:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
// #VKRay
#include "nv_helpers_vk/BottomLevelASGenerator.h"
#include "nv_helpers_vk/TopLevelASGenerator.h"
#include "nv_helpers_vk/VKHelpers.h"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the `HelloVulkan` class, we will introduce the notion of geometry instances.
Those instances store the buffer handles to the vertex and index arrays along with offsets in those, as well as a transform matrix:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
struct GeometryInstance
{
  VkBuffer     vertexBuffer;
  uint32_t     vertexCount;
  VkDeviceSize vertexOffset;
  VkBuffer     indexBuffer;
  uint32_t     indexCount;
  VkDeviceSize indexOffset;
  glm::mat4x4  transform;
};

void createGeometryInstances();

std::vector<GeometryInstance> m_geometryInstances;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We also declare an acceleration structure storage, which stores the handles to the buffers related to the acceleration structure builder: the scratch memory, the final result, and the instance definitions.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
struct AccelerationStructure
{
  VkBuffer                  scratchBuffer   = VK_NULL_HANDLE;
  VkDeviceMemory            scratchMem      = VK_NULL_HANDLE;
  VkBuffer                  resultBuffer    = VK_NULL_HANDLE;
  VkDeviceMemory            resultMem       = VK_NULL_HANDLE;
  VkBuffer                  instancesBuffer = VK_NULL_HANDLE;
  VkDeviceMemory            instancesMem    = VK_NULL_HANDLE;
  VkAccelerationStructureNV structure       = VK_NULL_HANDLE;
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We introduce 3 new methods as well: the builder for the bottom-level acceleration structure, the one for the top-level AS, and the global method `createAccelerationStructures` that will generate the acceleration structures for the whole scene, plus a `destroyAccelerationStructure` cleanup helper. Note that we also add storage for the top-level AS that will be used when ray tracing, and we store the helper object for the top-level AS in anticipation of potential AS refitting in a later chapter.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
AccelerationStructure createBottomLevelAS(VkCommandBuffer               commandBuffer,
                                          std::vector<GeometryInstance> vVertexBuffers);

void createTopLevelAS(
    VkCommandBuffer commandBuffer,
    const std::vector<std::pair<VkAccelerationStructureNV, glm::mat4x4>>&
        instances, /* pair of bottom-level AS and matrix of the instance */
    VkBool32 updateOnly);

void createAccelerationStructures();

void destroyAccelerationStructure(const AccelerationStructure& as);

nv_helpers_vk::TopLevelASGenerator m_topLevelASGenerator;
AccelerationStructure              m_topLevelAS;
std::vector<AccelerationStructure> m_bottomLevelAS;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the source file, add the code to generate the geometry instances. Since the simple OBJ loader imports the geometry as a single object, this method is straightforward. Using a more complex scene definition, this would of course have to be extended. In particular, the instances for ray tracing cannot be used to define a full, multilevel scene graph: the matrices would have to be combined for each instance, so that the resulting scene only contains a flat list of instances, each with its unique matrix.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
//--------------------------------------------------------------------------------------------------
// Create the instances from the scene data
// #VKRay
void HelloVulkan::createGeometryInstances()
{
  // The importer always imports the geometry as a single instance, without a
  // transform. Using a more complex importer, this should be adapted.
  glm::mat4x4 mat = glm::mat4x4(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1);
  m_geometryInstances.push_back(
      {m_vertexBuffer, m_nbVertices, 0, m_indexBuffer, m_nbIndices, 0, mat});
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The next step is the construction of the bottom-level acceleration structure, which will hold the actual geometry of the object. This method enqueues the construction work on a command buffer, and simply takes the geometry definition in the form of vertex and index arrays. Using the `BottomLevelASGenerator` helper, we first register the geometry data:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
//--------------------------------------------------------------------------------------------------
//
// Create a bottom-level acceleration structure based on a list of vertex
// buffers in GPU memory along with their vertex count. The build is then done
// in 3 steps: gathering the geometry, computing the sizes of the required
// buffers, and building the actual AS
// #VKRay
HelloVulkan::AccelerationStructure HelloVulkan::createBottomLevelAS(
    VkCommandBuffer commandBuffer, std::vector<GeometryInstance> vVertexBuffers)
{
  nv_helpers_vk::BottomLevelASGenerator bottomLevelAS;

  // Adding all vertex buffers and not transforming their position.
  for(const auto& buffer : vVertexBuffers)
  {
    if(buffer.indexBuffer == VK_NULL_HANDLE)
    {
      // No indices
      bottomLevelAS.AddVertexBuffer(buffer.vertexBuffer, buffer.vertexOffset, buffer.vertexCount,
                                    sizeof(Vertex), VK_NULL_HANDLE, 0);
    }
    else
    {
      // Indexed geometry
      bottomLevelAS.AddVertexBuffer(buffer.vertexBuffer, buffer.vertexOffset, buffer.vertexCount,
                                    sizeof(Vertex), buffer.indexBuffer, buffer.indexOffset,
                                    buffer.indexCount, VK_NULL_HANDLE, 0);
    }
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Then we can create the handle to the acceleration structure, by internally calling `vkCreateAccelerationStructureNV`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  AccelerationStructure buffers;

  // Once the overall size of the geometry is known, we can create the handle
  // for the acceleration structure
  buffers.structure = bottomLevelAS.CreateAccelerationStructure(VkCtx.getDevice(), VK_FALSE);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To build a bottom-level acceleration structure we need to allocate some amount of scratch memory that will be used by the builder to process the geometry, as well as some memory to hold the final BLAS. Since both are dependent on the scene complexity, we can obtain an estimate of the required sizes by calling `ComputeASBufferSizes`, which calls `vkGetAccelerationStructureMemoryRequirementsNV` internally. We can then allocate the buffers according to the output values.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // The AS build requires some scratch space to store temporary information.
  // The amount of scratch memory is dependent on the scene complexity.
  VkDeviceSize scratchSizeInBytes = 0;
  // The final AS also needs to be stored in addition to the existing vertex
  // buffers. Its size is also dependent on the scene complexity.
  VkDeviceSize resultSizeInBytes = 0;
  bottomLevelAS.ComputeASBufferSizes(VkCtx.getDevice(), buffers.structure, &scratchSizeInBytes,
                                     &resultSizeInBytes);

  // Once the sizes are obtained, the application is responsible for allocating
  // the necessary buffers. Since the entire generation will be done on the GPU,
  // we can directly allocate those in device local memory
  nv_helpers_vk::createBuffer(VkCtx.getPhysicalDevice(), VkCtx.getDevice(), scratchSizeInBytes,
                              VK_BUFFER_USAGE_RAY_TRACING_BIT_NV, &buffers.scratchBuffer,
                              &buffers.scratchMem, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
  nv_helpers_vk::createBuffer(VkCtx.getPhysicalDevice(), VkCtx.getDevice(), resultSizeInBytes,
                              VK_BUFFER_USAGE_RAY_TRACING_BIT_NV, &buffers.resultBuffer,
                              &buffers.resultMem, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The BLAS can finally be built using the allocated buffers: the `Generate` call will use `vkBindAccelerationStructureMemoryNV` to bind the allocated memory to the AS, and `vkCmdBuildAccelerationStructureNV` to enqueue the actual build order on the command buffer.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Build the acceleration structure. Note that this call integrates a barrier
  // on the generated AS, so that it can be used to compute a top-level AS right
  // after this method.
  bottomLevelAS.Generate(VkCtx.getDevice(), commandBuffer, buffers.structure,
                         buffers.scratchBuffer, 0, buffers.resultBuffer, buffers.resultMem,
                         false, VK_NULL_HANDLE);

  return buffers;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The top-level acceleration structure build follows a similar scheme, but this time takes a vector of instances, each defined by a bottom-level acceleration structure and a transform.
This method also supports dynamic updates, which will be used in a later chapter, through the `updateOnly` flag. For now the method will always have this flag set to `false`, forcing a complete build of the TLAS. As for the BLAS, we first add the contents of the TLAS, and then create the acceleration structure handle:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
//--------------------------------------------------------------------------------------------------
// Create the main acceleration structure that holds all instances of the scene.
// Similarly to the bottom-level AS generation, it is done in 3 steps: gathering
// the instances, computing the memory requirements for the AS, and building the
// AS itself
// #VKRay
void HelloVulkan::createTopLevelAS(
    VkCommandBuffer commandBuffer,
    const std::vector<std::pair<VkAccelerationStructureNV, glm::mat4x4>>&
        instances,  // pair of bottom-level AS and matrix of the instance
    VkBool32 updateOnly)
{
  if(!updateOnly)
  {
    // Gather all the instances into the builder helper
    for(size_t i = 0; i < instances.size(); i++)
    {
      // For each instance we set both its instance index and its hit group
      // index to its index i in the instance vector. The hit group index
      // defines which entry of the shader binding table will contain the hit
      // group to be executed when hitting this instance. We can use the same
      // index i since the scene only uses 1 type of rays: the camera rays
      m_topLevelASGenerator.AddInstance(instances[i].first, instances[i].second,
                                        static_cast<uint32_t>(i), static_cast<uint32_t>(i));
    }

    // Once all instances have been added, we can create the handle for the TLAS
    m_topLevelAS.structure =
        m_topLevelASGenerator.CreateAccelerationStructure(VkCtx.getDevice(), VK_TRUE);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We can then estimate the memory requirements for the build similarly to the BLAS.
However, we now have an additional buffer that will store the definitions of the instances:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
    // As for the bottom-level AS, building the AS requires some scratch space
    // to store temporary data in addition to the actual AS. In the case of the
    // top-level AS, the instance descriptors also need to be stored in GPU
    // memory. This call outputs the memory requirements for each (scratch,
    // results, instance descriptors) so that the application can allocate the
    // corresponding memory
    VkDeviceSize scratchSizeInBytes, resultSizeInBytes, instanceDescsSizeInBytes;
    m_topLevelASGenerator.ComputeASBufferSizes(VkCtx.getDevice(), m_topLevelAS.structure,
                                               &scratchSizeInBytes, &resultSizeInBytes,
                                               &instanceDescsSizeInBytes);

    // Create the scratch and result buffers. Since the build is all done on
    // GPU, those can be allocated in device local memory
    nv_helpers_vk::createBuffer(VkCtx.getPhysicalDevice(), VkCtx.getDevice(), scratchSizeInBytes,
                                VK_BUFFER_USAGE_RAY_TRACING_BIT_NV, &m_topLevelAS.scratchBuffer,
                                &m_topLevelAS.scratchMem, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
    nv_helpers_vk::createBuffer(VkCtx.getPhysicalDevice(), VkCtx.getDevice(), resultSizeInBytes,
                                VK_BUFFER_USAGE_RAY_TRACING_BIT_NV, &m_topLevelAS.resultBuffer,
                                &m_topLevelAS.resultMem, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);

    // The buffer describing the instances: ID, shader binding information,
    // matrices ... Those will be copied into the buffer by the helper through
    // mapping, so the buffer has to be allocated in host visible memory.
    nv_helpers_vk::createBuffer(VkCtx.getPhysicalDevice(), VkCtx.getDevice(),
                                instanceDescsSizeInBytes, VK_BUFFER_USAGE_RAY_TRACING_BIT_NV,
                                &m_topLevelAS.instancesBuffer, &m_topLevelAS.instancesMem,
                                VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
                                    | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT);
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From the instance definitions and the allocated buffers, we can now call the generation method, which will use `vkCmdBuildAccelerationStructureNV` to enqueue the AS build on the command buffer:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // After all the buffers are allocated, or if only an update is required, we
  // can build the acceleration structure. Note that in the case of the update
  // we also pass the existing AS as the 'previous' AS, so that it can be
  // refitted in place. This call also integrates a barrier on the generated
  // AS, so that it can be used for ray tracing right after this method.
  m_topLevelASGenerator.Generate(VkCtx.getDevice(), commandBuffer, m_topLevelAS.structure,
                                 m_topLevelAS.scratchBuffer, 0, m_topLevelAS.resultBuffer,
                                 m_topLevelAS.resultMem, m_topLevelAS.instancesBuffer,
                                 m_topLevelAS.instancesMem, updateOnly,
                                 updateOnly ? m_topLevelAS.structure : VK_NULL_HANDLE);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using the above methods, we can implement the construction of the full scene representation for ray tracing.
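As noted when generating the geometry instances, the ray tracing instances cannot encode a multilevel scene graph: before filling the TLAS, the per-node matrices have to be combined along each path from the root, so that every instance carries a single flattened matrix. The sketch below illustrates that flattening on a minimal, self-contained hierarchy; the `Node` type, the `Mat4` alias and the `flatten` helper are illustrative assumptions, not part of the tutorial code:

```cpp
#include <array>
#include <cassert>
#include <utility>
#include <vector>

using Mat4 = std::array<float, 16>;  // row-major 4x4 matrix, translation in the last column

// Multiply two row-major 4x4 matrices: returns a * b
static Mat4 mul(const Mat4& a, const Mat4& b)
{
  Mat4 out{};
  for(int r = 0; r < 4; r++)
    for(int c = 0; c < 4; c++)
      for(int k = 0; k < 4; k++)
        out[4 * r + c] += a[4 * r + k] * b[4 * k + c];
  return out;
}

// Illustrative scene graph node: a local transform, an optional mesh index,
// and a list of children. NOT part of the sample code.
struct Node
{
  Mat4              local;
  int               meshId = -1;  // -1 marks a pure transform node
  std::vector<Node> children;
};

// Collapse the hierarchy into (meshId, worldMatrix) pairs: exactly the flat
// instance list the top-level AS expects, one combined matrix per instance
static void flatten(const Node& node, const Mat4& parent,
                    std::vector<std::pair<int, Mat4>>& instances)
{
  const Mat4 world = mul(parent, node.local);
  if(node.meshId >= 0)
    instances.push_back({node.meshId, world});
  for(const Node& child : node.children)
    flatten(child, world, instances);
}
```

Calling `flatten` on the root with an identity matrix yields one entry per mesh node, with all ancestor transforms folded in; each resulting matrix could then be converted to `glm::mat4x4` and passed to `AddInstance`.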
Since the builder calls require a command buffer, we allocate one that will only be used for that purpose:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
//--------------------------------------------------------------------------------------------------
// Create the bottom-level and top-level acceleration structures
// #VKRay
void HelloVulkan::createAccelerationStructures()
{
  // Create a one-time command buffer in which the AS build commands will be
  // issued
  VkCommandBufferAllocateInfo commandBufferAllocateInfo;
  commandBufferAllocateInfo.sType       = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
  commandBufferAllocateInfo.pNext       = nullptr;
  commandBufferAllocateInfo.commandPool = VkCtx.getCommandPool()[VkCtx.getFrameIndex()];
  commandBufferAllocateInfo.level       = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
  commandBufferAllocateInfo.commandBufferCount = 1;

  VkCommandBuffer commandBuffer = VK_NULL_HANDLE;
  VkResult        code =
      vkAllocateCommandBuffers(VkCtx.getDevice(), &commandBufferAllocateInfo, &commandBuffer);
  if(code != VK_SUCCESS)
  {
    throw std::logic_error("rt vkAllocateCommandBuffers failed");
  }

  VkCommandBufferBeginInfo beginInfo;
  beginInfo.sType            = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
  beginInfo.pNext            = nullptr;
  beginInfo.flags            = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
  beginInfo.pInheritanceInfo = nullptr;
  vkBeginCommandBuffer(commandBuffer, &beginInfo);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Then, for each instance in the scene, we generate the bottom-level acceleration structure:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // For each geometric object, we compute the corresponding bottom-level
  // acceleration structure (BLAS)
  m_bottomLevelAS.resize(m_geometryInstances.size());
  std::vector<std::pair<VkAccelerationStructureNV, glm::mat4x4>> instances;
  for(size_t i = 0; i < m_geometryInstances.size(); i++)
  {
    m_bottomLevelAS[i] = createBottomLevelAS(
        commandBuffer,
        {{m_geometryInstances[i].vertexBuffer, m_geometryInstances[i].vertexCount,
          m_geometryInstances[i].vertexOffset, m_geometryInstances[i].indexBuffer,
          m_geometryInstances[i].indexCount, m_geometryInstances[i].indexOffset}});
    instances.push_back({m_bottomLevelAS[i].structure, m_geometryInstances[i].transform});
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The top-level AS is then created using the above instance definitions.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Create the top-level AS from the previously computed BLAS
  createTopLevelAS(commandBuffer, instances, VK_FALSE);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We finalize the build by submitting the command buffer, waiting for it to complete, and freeing our one-time command buffer.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // End the command buffer and submit it
  vkEndCommandBuffer(commandBuffer);

  VkSubmitInfo submitInfo;
  submitInfo.sType                = VK_STRUCTURE_TYPE_SUBMIT_INFO;
  submitInfo.pNext                = nullptr;
  submitInfo.waitSemaphoreCount   = 0;
  submitInfo.pWaitSemaphores      = nullptr;
  submitInfo.pWaitDstStageMask    = nullptr;
  submitInfo.commandBufferCount   = 1;
  submitInfo.pCommandBuffers      = &commandBuffer;
  submitInfo.signalSemaphoreCount = 0;
  submitInfo.pSignalSemaphores    = nullptr;
  vkQueueSubmit(VkCtx.getQueue(), 1, &submitInfo, VK_NULL_HANDLE);

  vkQueueWaitIdle(VkCtx.getQueue());

  vkFreeCommandBuffers(VkCtx.getDevice(), VkCtx.getCommandPool()[VkCtx.getFrameIndex()], 1,
                       &commandBuffer);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All those resources have to be cleaned up when closing the application.
The implementation of `destroyAccelerationStructure` simplifies that task:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
//--------------------------------------------------------------------------------------------------
// Destroys an acceleration structure and all the resources associated with it
void HelloVulkan::destroyAccelerationStructure(const AccelerationStructure& as)
{
  vkDestroyBuffer(VkCtx.getDevice(), as.scratchBuffer, nullptr);
  vkFreeMemory(VkCtx.getDevice(), as.scratchMem, nullptr);
  vkDestroyBuffer(VkCtx.getDevice(), as.resultBuffer, nullptr);
  vkFreeMemory(VkCtx.getDevice(), as.resultMem, nullptr);
  vkDestroyBuffer(VkCtx.getDevice(), as.instancesBuffer, nullptr);
  vkFreeMemory(VkCtx.getDevice(), as.instancesMem, nullptr);
  vkDestroyAccelerationStructureNV(VkCtx.getDevice(), as.structure, nullptr);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This method can then be called at the end of `destroyResources`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // #VKRay
  destroyAccelerationStructure(m_topLevelAS);
  for(auto& as : m_bottomLevelAS)
    destroyAccelerationStructure(as);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## main

In the `main` function, we can now add the creation of the geometry instances and acceleration structures right after initializing the ray tracing extension:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
// #VKRay
helloVulkan.initRayTracing();
helloVulkan.createGeometryInstances();
helloVulkan.createAccelerationStructures();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Ray Tracing Descriptor Set

The ray tracing shaders, like the rasterization shaders, use external resources grouped into a descriptor
set. A key difference, however, is that with rasterization, a scene requiring several types of shaders allows each set of shaders to have its own descriptor set(s). For example, objects with different materials may each have a descriptor set containing the handles of the textures they need. This is easily done since, for a given material, we would create its corresponding rasterization pipeline and use that pipeline to render all the objects with that material. On the contrary, with ray tracing it is not possible to know in advance which objects will be hit by a ray, so any shader may be invoked at any time. The Vulkan ray tracing extension therefore uses a single descriptor set containing all the resources necessary to render the scene: for example, it would contain all the textures for all the materials.

In the class definition, we will add a method to create that descriptor set, as well as the storage for the descriptor pool, layout and the set itself.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
void createRaytracingDescriptorSet();
void updateRaytracingRenderTarget(VkImageView target);

nv_helpers_vk::DescriptorSetGenerator m_rtDSG;
VkDescriptorPool                      m_rtDescriptorPool;
VkDescriptorSetLayout                 m_rtDescriptorSetLayout;
VkDescriptorSet                       m_rtDescriptorSet;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For simplicity we will use a helper class for generating the descriptor pool, layout and set, by adding this include after the other includes of the source file:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
#include "nv_helpers_vk/DescriptorSetGenerator.h"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The implementation of the descriptor set first makes sure the geometry data has finished uploading to the GPU using a barrier:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
//--------------------------------------------------------------------------------------------------
// Create the descriptor set used by the raytracing shaders: note that all
// shaders will access the same descriptor set, and therefore the set needs to
// contain all the resources used by the shaders. For example, it will contain
// all the textures used in the scene.
void HelloVulkan::createRaytracingDescriptorSet()
{
  // We will bind the vertex and index buffers, so we first add a barrier on
  // those buffers to make sure their data is actually present on the GPU
  VkBufferMemoryBarrier bmb = {};
  bmb.sType               = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
  bmb.pNext               = nullptr;
  bmb.srcAccessMask       = 0;
  bmb.dstAccessMask       = VK_ACCESS_SHADER_READ_BIT;
  bmb.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
  bmb.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
  bmb.offset              = 0;
  bmb.size                = VK_WHOLE_SIZE;

  VkCommandBuffer commandBuffer = VkCtx.beginSingleTimeCommands();

  bmb.buffer = m_vertexBuffer;
  vkCmdPipelineBarrier(commandBuffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
                       VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, 0, 0, nullptr, 1, &bmb, 0, nullptr);
  bmb.buffer = m_indexBuffer;
  vkCmdPipelineBarrier(commandBuffer, VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
                       VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, 0, 0, nullptr, 1, &bmb, 0, nullptr);
  VkCtx.endSingleTimeCommands(commandBuffer);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The descriptor set generation helper first requires the definition of all the bindings: the binding point as defined in the `layout(binding = xx)` declarations of the shaders, the number of descriptors for that binding location, the descriptor type, and in which shaders the descriptor(s) will be used.
Our ray tracing shaders will, arbitrarily, use location 0 for the top-level acceleration structure, 1 for the ray tracing output, 2 for the camera information, 3-4 for the geometry data (that will be used for shading), 5 for the material information, and 6 for the material textures.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Add the bindings to the resources
  // Top-level acceleration structure, usable by both the ray generation and the
  // closest hit (to shoot shadow rays)
  m_rtDSG.AddBinding(0, 1, VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV,
                     VK_SHADER_STAGE_RAYGEN_BIT_NV);
  // Raytracing output
  m_rtDSG.AddBinding(1, 1, VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, VK_SHADER_STAGE_RAYGEN_BIT_NV);
  // Camera information
  m_rtDSG.AddBinding(2, 1, VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, VK_SHADER_STAGE_RAYGEN_BIT_NV);
  // Scene data
  // Vertex buffer
  m_rtDSG.AddBinding(3, 1, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV);
  // Index buffer
  m_rtDSG.AddBinding(4, 1, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV);
  // Material buffer
  m_rtDSG.AddBinding(5, 1, VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV);
  // Textures
  m_rtDSG.AddBinding(6, static_cast<uint32_t>(m_textureSampler.size()),
                     VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
                     VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From this information we can create the descriptor pool, layout and a descriptor set. Internally, those methods use the `vkCreateDescriptorPool`, `vkCreateDescriptorSetLayout` and `vkAllocateDescriptorSets` calls.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Create the descriptor pool and layout
  m_rtDescriptorPool      = m_rtDSG.GeneratePool(VkCtx.getDevice());
  m_rtDescriptorSetLayout = m_rtDSG.GenerateLayout(VkCtx.getDevice());

  // Generate the descriptor set
  m_rtDescriptorSet =
      m_rtDSG.GenerateSet(VkCtx.getDevice(), m_rtDescriptorPool, m_rtDescriptorSetLayout);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once the descriptor set has been allocated, we fill it with the actual descriptors using the `Bind` method, which takes a `VkDescriptor*Info` (or a `VkWriteDescriptorSetAccelerationStructureNV` for acceleration structures) defining the resource being bound to each location. Here, we bind the top-level acceleration structure:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Bind the actual resources into the descriptor set
  // Top-level acceleration structure
  VkWriteDescriptorSetAccelerationStructureNV descriptorAccelerationStructureInfo;
  descriptorAccelerationStructureInfo.sType =
      VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET_ACCELERATION_STRUCTURE_NV;
  descriptorAccelerationStructureInfo.pNext                      = nullptr;
  descriptorAccelerationStructureInfo.accelerationStructureCount = 1;
  descriptorAccelerationStructureInfo.pAccelerationStructures    = &m_topLevelAS.structure;

  m_rtDSG.Bind(m_rtDescriptorSet, 0, {descriptorAccelerationStructureInfo});
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The other resources follow the same pattern, so that the output buffer, camera matrices, geometry data, material definition and textures are bound in the descriptor set.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Camera matrices
  VkDescriptorBufferInfo camInfo = {};
  camInfo.buffer = m_uniformBuffer;
  camInfo.offset = 0;
  camInfo.range  = sizeof(UniformBufferObject);
  m_rtDSG.Bind(m_rtDescriptorSet, 2, {camInfo});

  // Vertex buffer
  VkDescriptorBufferInfo vertexInfo = {};
  vertexInfo.buffer = m_vertexBuffer;
  vertexInfo.offset = 0;
  vertexInfo.range  = VK_WHOLE_SIZE;
  m_rtDSG.Bind(m_rtDescriptorSet, 3, {vertexInfo});

  // Index buffer
  VkDescriptorBufferInfo indexInfo = {};
  indexInfo.buffer = m_indexBuffer;
  indexInfo.offset = 0;
  indexInfo.range  = VK_WHOLE_SIZE;
  m_rtDSG.Bind(m_rtDescriptorSet, 4, {indexInfo});

  // Material buffer
  VkDescriptorBufferInfo materialInfo = {};
  materialInfo.buffer = m_matColorBuffer;
  materialInfo.offset = 0;
  materialInfo.range  = VK_WHOLE_SIZE;
  m_rtDSG.Bind(m_rtDescriptorSet, 5, {materialInfo});

  // Textures
  std::vector<VkDescriptorImageInfo> imageInfos;
  for(size_t i = 0; i < m_textureSampler.size(); ++i)
  {
    VkDescriptorImageInfo imageInfo = {};
    imageInfo.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
    imageInfo.imageView   = m_textureImageView[i];
    imageInfo.sampler     = m_textureSampler[i];
    imageInfos.push_back(imageInfo);
  }
  if(!m_textureSampler.empty())
  {
    m_rtDSG.Bind(m_rtDescriptorSet, 6, imageInfos);
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The actual binding is done by calling `UpdateSetContents`, which will invoke `vkUpdateDescriptorSets` to write the binding data into the descriptor set:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Copy the bound resource handles into the descriptor set
  m_rtDSG.UpdateSetContents(VkCtx.getDevice(), m_rtDescriptorSet);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this binding description, we see that the raytracing shaders will access the vertex and index buffers
as storage buffers. This is required to fetch the vertex coordinates and attributes in the raytracing shaders. To allow this, we first need to add `VK_BUFFER_USAGE_STORAGE_BUFFER_BIT` to the buffer usage of the device-local vertex buffer, declared in `createVertexBuffer`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
VkCtx.createBuffer(bufferSize,
                   VK_BUFFER_USAGE_TRANSFER_DST_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT
                       | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
                   VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, m_vertexBuffer, m_vertexBufferMemory);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Similarly, we add that usage flag to the index buffer in `createIndexBuffer`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
VkCtx.createBuffer(bufferSize,
                   VK_BUFFER_USAGE_TRANSFER_DST_BIT | VK_BUFFER_USAGE_INDEX_BUFFER_BIT
                       | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT,
                   VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, m_indexBuffer, m_indexBufferMemory);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For each frame, we will have to update the buffer in which the raytracing shaders will write, just like for rasterization:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
void HelloVulkan::updateRaytracingRenderTarget(VkImageView target)
{
  // Output buffer
  VkDescriptorImageInfo descriptorOutputImageInfo;
  descriptorOutputImageInfo.sampler     = nullptr;
  descriptorOutputImageInfo.imageView   = target;
  descriptorOutputImageInfo.imageLayout = VK_IMAGE_LAYOUT_GENERAL;

  m_rtDSG.Bind(m_rtDescriptorSet, 1, {descriptorOutputImageInfo});
  // Copy the bound resource handles into the descriptor set
  m_rtDSG.UpdateSetContents(VkCtx.getDevice(), m_rtDescriptorSet);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The
resources created in this section need to be destroyed when closing the application, by adding the following to `destroyResources`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
vkDestroyDescriptorSetLayout(VkCtx.getDevice(), m_rtDescriptorSetLayout, nullptr);
vkDestroyDescriptorPool(VkCtx.getDevice(), m_rtDescriptorPool, nullptr);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## main

In the `main` function, add the creation of the descriptor set after the other ray tracing calls:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
helloVulkan.createRaytracingDescriptorSet();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Ray Tracing Pipeline

When creating rasterization shaders with Vulkan, the application compiles them into executable shaders, which are bound to the rasterization pipeline. All objects rendered using this pipeline will use those shaders. To render an image with several types of shaders, the rasterization pipeline needs to be set to use each of them in turn before calling the corresponding draw commands. In a ray tracing context, a ray traced into the scene can hit any object and thus trigger the execution of any shader. Instead of using one shader executable at a time, we now need to have all shaders available at once. The pipeline then contains all the shaders required to render the scene, along with information on how to execute them. To be able to raytrace some geometry, the Vulkan ray tracing extension requires at least 3 shader programs:
* The ray generation program, which is the starting point of the ray tracing and is called for each pixel: it will typically initialize a ray starting at the location of the camera, in a direction given by evaluating the camera lens model at the pixel location. It will then invoke `traceNV()`, which will shoot the ray in the scene.
Other shaders below will process further events, and return their result to the ray generation shader through the ray payload.
* The miss shader is executed when a ray does not intersect any geometry. It can typically sample an environment map, or return a simple color through the ray payload.
* The closest hit shader is called upon hitting the geometric instance closest to the starting point of the ray. This shader can, for example, perform lighting calculations and return the results through the ray payload. There can be as many closest hit shaders as needed, in the same spirit as a rasterization-based application has multiple pixel shaders depending on the objects.

Two more shader types can optionally be used:
* The intersection shader, which allows intersecting user-defined geometry. For example, this can be particularly useful when intersecting procedural geometry or subdivision surfaces without tessellating them beforehand. Using this shader requires modifying how the acceleration structures are built, and is beyond the scope of this tutorial. We will instead rely on the built-in triangle intersection shader provided by the extension, which returns 2 floating-point values representing the barycentric coordinates `(u,v)` of the hit point inside the triangle. For a triangle made of vertices `v0`, `v1`, `v2`, the barycentric coordinates define the weights of the vertices as follows:

**********************
*        .  u        *
*       / \          *
*      / v1\         *
*     /     \        *
*    /       \       *
*1-u-v/ v0  v2 \ v   *
*    '-----------'   *
**********************

* The any hit shader is executed on each potential intersection: when searching for the hit point closest to the ray origin, several candidates may be found on the way. The any hit shader can typically be used to efficiently implement alpha-testing. If the alpha test fails, the ray traversal can continue without having to call `traceNV()` again.
The built-in any hit shader is simply a pass-through, returning the intersection to the traversal engine that will determine which potential intersection is the closest.

![Figure [step]: The Ray Tracing Pipeline](/sites/default/files/pictures/2019/vulkan_raytracing/ShaderPipeline.svg)

In this tutorial we will create a pipeline containing only the 3 mandatory shader programs: a single ray generation, a single miss, and a single closest hit shader. This is done by first compiling each GLSL shader program into SPIR-V. The SPIR-V shaders will then be linked together within the raytracing pipeline, which will be able to route the intersection calculations to the right hit shaders. To be able to focus on the pipeline generation, we provide simplistic shaders:

!!! Note: Shaders ([Download](/rtx/raytracing/vkrt_helpers/files/ Download the shaders and extract the content to the project folder.

The `shaders` folder now contains 3 more files:
* `rayGen.rgen` contains the ray generation program. It also declares its access to the ray tracing output buffer `image`, and the ray tracing acceleration structure `topLevelAS`, bound as an `accelerationStructureNV`. For now this shader program simply writes a constant color in the ray tracing output buffer.
* `miss.rmiss` defines the miss shader. This shader will be executed when no geometry is hit, and will write a constant color in the ray payload `rayPayloadInNV`, which is provided automatically. Since our current ray generation program does not trace any ray for now, this shader will not be called.
* `closestHit.rchit` contains a very simple closest hit shader. It will be executed upon hitting the geometry (our triangle). Like the miss shader, it takes the ray payload `rayPayloadInNV`. It also has a second input defining the intersection attributes `hitAttributeNV` as provided by the intersection shader, i.e. the barycentric coordinates. This shader simply writes a constant color to the payload.
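The barycentric weighting described above is what a closest hit shader typically uses to interpolate per-vertex attributes (normals, texture coordinates) at the hit point. A minimal CPU-side sketch of that computation, using a hypothetical `Vec3` type that is not part of the tutorial code:

```cpp
#include <cassert>
#include <cmath>

// Minimal 3-component vector, for illustration only.
struct Vec3
{
  float x, y, z;
};

// Interpolate an attribute at the hit point from the three vertex values,
// using the (u,v) barycentrics returned by the built-in triangle intersection:
// weight 1-u-v for v0, u for v1, v for v2.
Vec3 lerpBarycentric(const Vec3& v0, const Vec3& v1, const Vec3& v2, float u, float v)
{
  float w0 = 1.f - u - v;
  return {w0 * v0.x + u * v1.x + v * v2.x,
          w0 * v0.y + u * v1.y + v * v2.y,
          w0 * v0.z + u * v1.z + v * v2.z};
}
```

For example, `u = v = 1/3` yields the triangle centroid, while `u = 1, v = 0` returns the attribute of `v1` exactly.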
In order to use the GLSL compiler for our new shaders, we need to modify the properties of the project to use the `GLSLValidateVS.props` we just extracted from the archive instead of the default ones. For this, go to the 'Property Manager', right-click on the project name, choose 'Add existing property sheet' and choose `GLSLValidateVS.props`. Add the shader files in the `shaders` filter of the Visual Studio project. For each of them, make sure that their properties page lists the `Item Type` as `GLSL Validator`. All the shader files should then compile, and the resulting SPIR-V files are stored in the `shaders` folder alongside the GLSL files.

In the header file, add the definition of the ray tracing pipeline building method, and the storage members of the pipeline:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
void createRaytracingPipeline();

VkPipelineLayout m_rtPipelineLayout = VK_NULL_HANDLE;
VkPipeline       m_rtPipeline       = VK_NULL_HANDLE;

uint32_t m_rayGenIndex;
uint32_t m_hitGroupIndex;
uint32_t m_missIndex;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After the other includes of the source file, add the include for the ray tracing pipeline generation helper:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
#include "nv_helpers_vk/RaytracingPipelineGenerator.h"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The implementation of the ray tracing pipeline generation starts by adding the ray generation and miss shader stages, although this could be done in an arbitrary order. When setting up the stages of the pipeline, each addition call returns the index of the stage in the pipeline, which will later be used to associate the pipeline stages to actual geometries and provide entry points for the ray tracing process.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
//--------------------------------------------------------------------------------------------------
// Create the raytracing pipeline, containing the handles and data for each
// raytracing shader. For each shader or hit group we retain its index, so that
// they can be bound to the geometry in the shader binding table.
void HelloVulkan::createRaytracingPipeline()
{
  nv_helpers_vk::RayTracingPipelineGenerator pipelineGen;
  // We use only one ray generation shader, that will implement the camera model
  VkShaderModule rayGenModule = VkCtx.createShaderModule(readFile("shaders/raygen.spv"));
  m_rayGenIndex = pipelineGen.AddRayGenShaderStage(rayGenModule);
  // The first miss shader is used to look-up the environment in case the rays
  // from the camera miss the geometry
  VkShaderModule missModule = VkCtx.createShaderModule(readFile("shaders/miss.spv"));
  m_missIndex = pipelineGen.AddMissShaderStage(missModule);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All the shaders related to the intersection of a given object type are grouped into a hit group. At most, a hit group can contain an intersection shader, an any hit shader, and a closest hit shader. For simplicity, we use the built-in intersection and any-hit shaders, which leaves our hit group with only one entry, the closest hit shader:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // The first hit group defines the shaders invoked when a ray shot from the
  // camera hits the geometry. In this case we only specify the closest hit
  // shader, and rely on the built-in triangle intersection and pass-through
  // any-hit shaders. However, explicit intersection and any hit shaders could be
  // added as well.
  m_hitGroupIndex = pipelineGen.StartHitGroup();
  VkShaderModule closestHitModule = VkCtx.createShaderModule(readFile("shaders/closesthit.spv"));
  pipelineGen.AddHitShaderStage(closestHitModule, VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV);
  pipelineGen.EndHitGroup();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From those 3 shaders and an indication of the maximum recursion level we want to use, we can now generate the pipeline, which internally calls `vkCreatePipelineLayout` and `vkCreateRayTracingPipelinesNV`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // The raytracing process can currently only shoot rays from the camera, hence a
  // recursion level of 1. This number should be kept as low as possible for
  // performance reasons. Even recursive raytracing should be flattened into a
  // loop in the ray generation to avoid deep recursion.
  pipelineGen.SetMaxRecursionDepth(1);

  // Generate the pipeline
  pipelineGen.Generate(VkCtx.getDevice(), m_rtDescriptorSetLayout, &m_rtPipeline,
                       &m_rtPipelineLayout);

  vkDestroyShaderModule(VkCtx.getDevice(), rayGenModule, nullptr);
  vkDestroyShaderModule(VkCtx.getDevice(), missModule, nullptr);
  vkDestroyShaderModule(VkCtx.getDevice(), closestHitModule, nullptr);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The pipeline layout and the pipeline itself also have to be cleaned up upon closing, hence we add this to `destroyResources`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
vkDestroyPipelineLayout(VkCtx.getDevice(), m_rtPipelineLayout, nullptr);
vkDestroyPipeline(VkCtx.getDevice(), m_rtPipeline, nullptr);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## main

In the `main` function, add the pipeline construction after the other ray tracing calls:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
helloVulkan.createRaytracingPipeline();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Shader Binding Table

The Shader Binding Table (SBT) is where the shader programs and the top-level acceleration structure are bound together, so that the ray tracer knows which program to execute for which geometry. It contains one ray generation entry, at least one miss entry, and a number of hit groups. There should be *n* hit group entries, up to the maximum index passed to the instance description parameter `instanceOffset`. In a typical rasterization setup, a current shader and its associated resources are bound prior to drawing the corresponding objects; then another shader and resource set can be bound for some other objects, and so on. Since ray tracing can hit any surface of the scene at any time, it is impossible to know in advance which shaders need to be bound. Therefore, the SBT is an array of SBT entries holding information on the location of shaders and their resources for each object.

## SBT Entry

An SBT entry consists of a header and a data section. The header stores a shader identifier, while the data section provides pushconstant data. In the header file, add the following include:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
#include "nv_helpers_vk/ShaderBindingTableGenerator.h"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This file contains our SBT [helper]( that eases the SBT creation process and enforces consistency between the SBT layout and the later ray tracing calls. Internally the `Add*` methods collect the names of the shader programs associated with the pointers of their input resources in GPU memory.
The `Generate` call maps the input buffer and, for each collected entry, sets the corresponding shader identifier using `vkGetRayTracingShaderGroupHandlesNV` and copies its resource pointers afterwards. The helper first copies the ray generation programs, then the miss programs, and finally the hit groups.

And add the following declarations:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
void createShaderBindingTable();

nv_helpers_vk::ShaderBindingTableGenerator m_sbtGen;
VkBuffer       m_shaderBindingTableBuffer;
VkDeviceMemory m_shaderBindingTableMem;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Shader Binding Table (SBT) is the cornerstone of the ray tracing setup: it links the geometry instances to their corresponding hit groups, and binds the pushconstant values to the ray tracing shader programs. In this tutorial, we have a scene containing a single instance. The Shader Binding Table then has 3 entries: one for the ray generation program, one for the miss program, and one for the hit group. None of them uses pushconstant values, so the SBT is laid out as follows:

******************
*+--------------+*
*|    RayGen    |*
*|  Identifier  |*
*+--------------+*
*|     Miss     |*
*|  Identifier  |*
*+--------------+*
*|   HitGroup   |*
*|  Identifier  |*
*+--------------+*
******************

When starting the ray tracing process, the identifier of the ray generation program will be used to execute its entry point for each pixel. When the ray generation program shoots a ray, the descriptor set will be used to find the location of the top-level acceleration structure in GPU memory and trigger the tracing itself. The ray may miss all geometry, in which case the SBT will be used to find the miss shader identifier and execute the corresponding code.
If the ray hits the geometry, the hit group identifier will be used to find the shaders associated to the hit group: intersection, any hit and closest hit. In that order, those shaders will be executed, and the result sent to the ray generation shader. The ray generation shader can then access the ray tracing output buffer from the descriptor set, and write its result. If the scene contains several objects with different hit groups, the SBT will contain all the hit groups and their pushconstant values. As an example, we could have 3 objects, each accessing some camera data in the descriptor set. Objects 0 and 1 would each have their own texture index, while Object 2 would not have one. The SBT would then have this structure:

******************
*+--------------+*
*|    RayGen    |*
*|  Identifier  |*
*+--------------+*
*|     Miss     |*
*|  Identifier  |*
*+--------------+*
*|  HitGroup0   |*
*|  Identifier  |*
*+--------------+*
*|   Texture0   |*
*|    Index     |*
*+--------------+*
*|  HitGroup1   |*
*|  Identifier  |*
*+--------------+*
*|   Texture1   |*
*|    Index     |*
*+--------------+*
*|  HitGroup2   |*
*|  Identifier  |*
*+--------------+*
*|      //      |*
*|              |*
*+--------------+*
******************

Note that `HitGroup2` does not need any texture index. However, the alignment requirements of the SBT force each program type (ray generation, miss, hit group) to have a fixed entry size for all of its members. The size of the entry for a given program type is then driven by the maximum number of pushconstant values within that type: 0 for the ray generation, 0 for the miss, and 1 for the hit group. Therefore, the SBT entry is padded to respect the alignment. In many practical applications, the ray tracing process uses multiple ray types, for example to differentiate between regular rays and shadow rays. In such cases, the SBT would contain one hit group per ray type, for each object type.
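The padding rule described above boils down to simple alignment arithmetic. The sketch below illustrates it with made-up numbers (a 16-byte shader group handle, a 64-byte group alignment, and a 4-byte texture index for the largest hit group entry); in the real application those values come from `VkPhysicalDeviceRayTracingPropertiesNV`, which is what the SBT helper queries internally:

```cpp
#include <cassert>
#include <cstdint>

// Round x up to the next multiple of `alignment` (alignment must be a power of two).
uint64_t roundUp(uint64_t x, uint64_t alignment)
{
  return (x + alignment - 1) & ~(alignment - 1);
}

// Entry size for a program type: the shader identifier plus the largest inline
// data block used by any entry of that type, padded to the group alignment.
uint64_t entrySize(uint64_t handleSize, uint64_t maxInlineDataSize, uint64_t alignment)
{
  return roundUp(handleSize + maxInlineDataSize, alignment);
}
```

With the assumed numbers, ray generation and miss entries (no data) pad from 16 to 64 bytes, and hit group entries carrying a 4-byte index pad from 20 to 64 bytes, so `HitGroup2` simply wastes its unused data bytes.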
Going back to a sample with a single object for conciseness, adding a second ray type simply requires adding the corresponding hit group in the SBT:

******************
*+--------------+*
*|    RayGen    |*
*|  Identifier  |*
*+--------------+*
*|     Miss     |*
*|  Identifier  |*
*+--------------+*
*|   HitGroup   |*
*|  Identifier  |*
*+--------------+*
*| ShadowGroup  |*
*|  Identifier  |*
*+--------------+*
******************

How the pipeline associates a geometry with a hit group depends on the hit group index used when adding an instance to the top-level AS helper class. Internally, this index maps to the `instanceOffset` of the `VkGeometryInstance` object. Getting back to our sample, none of our shaders use pushconstant values. We then simply add the shader program indices to the SBT, without any additional values:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
//--------------------------------------------------------------------------------------------------
// Create the shader binding table, associating the geometry to the indices of the shaders in the
// ray tracing pipeline
void HelloVulkan::createShaderBindingTable()
{
  // Add the entry point, the ray generation program
  m_sbtGen.AddRayGenerationProgram(m_rayGenIndex, {});
  // Add the miss shader for the camera rays
  m_sbtGen.AddMissProgram(m_missIndex, {});

  // For each instance, we will have 1 hit group for the camera rays.
  // When setting the instances in the top-level acceleration structure we indicated the index
  // of the hit group in the shader binding table that will be invoked.
// Add the hit group defining the behavior upon hitting a surface with a camera ray m_sbtGen.AddHitGroup(m_hitGroupIndex, {}); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From the number of shaders, we can query the helper object to know the size of the SBT, and allocate the corresponding buffer as host visible: the helper will use mapping to write the contents of the SBT. As an exercise, it is then possible to copy the data to a GPU-only buffer for performance. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C // Compute the required size for the SBT VkDeviceSize shaderBindingTableSize = m_sbtGen.ComputeSBTSize(m_raytracingProperties); // Allocate mappable memory to store the SBT nv_helpers_vk::createBuffer(VkCtx.getPhysicalDevice(), VkCtx.getDevice(), shaderBindingTableSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT, &m_shaderBindingTableBuffer, &m_shaderBindingTableMem, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The `Generate` call will populate the Shader Binding Table: in our case it will simply copy the shader identifiers obtained by `vkGetRayTracingShaderGroupHandlesNV` into the target buffer. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C // Generate the SBT using mapping. For further performance a staging buffer should be used, so // that the SBT is guaranteed to reside on GPU memory without overheads. 
  m_sbtGen.Generate(VkCtx.getDevice(), m_rtPipeline, m_shaderBindingTableBuffer,
                    m_shaderBindingTableMem);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As with other resources, we destroy the SBT in `destroyResources`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
vkDestroyBuffer(VkCtx.getDevice(), m_shaderBindingTableBuffer, nullptr);
vkFreeMemory(VkCtx.getDevice(), m_shaderBindingTableMem, nullptr);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## main

In the `main` function, add the construction of the Shader Binding Table:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
helloVulkan.createShaderBindingTable();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Let's Raytrace!

We now have everything set up to be able to trace rays: the acceleration structure, the descriptor set, the ray tracing pipeline and the shader binding table. Let's try to make images from this.

## main

In the `main` function, we will define a local variable that we will use to decide whether we want to rasterize or raytrace our scene.
Add the following right after the ray tracing initialization calls:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
bool use_raster_render = true;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the same function, find the line `ImGui::ColorEdit3(` and, right after that call, add

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
ImGui::Checkbox("Raster mode", &use_raster_render);  // Switch between raster and ray tracing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A few lines below, you can find an `if (1==1)` block containing the rasterization calls. Replace that condition with

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
if (use_raster_render)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

At the end of that `if` block, we can now add an `else` block that will call the ray tracing. As explained before, ray tracing cannot write directly to the render target, and uses another buffer for that purpose. This ray tracing output buffer then needs to be copied into the render target.
We start the ray tracing path by ensuring the ray tracing output image can be written, using a barrier:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
else
{
  VkClearValue clearColor = {0.0f, 0.5f, 0.0f, 1.0f};

  VkImageSubresourceRange subresourceRange;
  subresourceRange.aspectMask     = VK_IMAGE_ASPECT_COLOR_BIT;
  subresourceRange.baseMipLevel   = 0;
  subresourceRange.levelCount     = 1;
  subresourceRange.baseArrayLayer = 0;
  subresourceRange.layerCount     = 1;

  nv_helpers_vk::imageBarrier(cmdBuff, VkCtx.getCurrentBackBuffer(), subresourceRange, 0,
                              VK_ACCESS_SHADER_WRITE_BIT, VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
                              VK_IMAGE_LAYOUT_GENERAL);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We then bind the ray tracing pipeline and the descriptor set to allow the shaders to access their resources:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  helloVulkan.updateRaytracingRenderTarget(VkCtx.getCurrentBackBufferView());
  VkCtx.beginRenderPass();
  vkCmdBindPipeline(cmdBuff, VK_PIPELINE_BIND_POINT_RAY_TRACING_NV, helloVulkan.m_rtPipeline);
  vkCmdBindDescriptorSets(cmdBuff, VK_PIPELINE_BIND_POINT_RAY_TRACING_NV,
                          helloVulkan.m_rtPipelineLayout, 0, 1, &helloVulkan.m_rtDescriptorSet, 0,
                          nullptr);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ray tracing itself can then be added to the command buffer by calling `vkCmdTraceRaysNV`. This function requires the shader binding table, and offsets to indicate where the ray generation, miss and hit groups can be found. Note that our helper puts all shaders in a single SBT, but the API allows having one table per shader type. For each type, we also provide the stride, which is the size of an SBT entry for that type. The offsets and strides are provided by the SBT generation helper to ensure consistency.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  VkDeviceSize rayGenOffset   = helloVulkan.m_sbtGen.GetRayGenOffset();
  VkDeviceSize missOffset     = helloVulkan.m_sbtGen.GetMissOffset();
  VkDeviceSize missStride     = helloVulkan.m_sbtGen.GetMissEntrySize();
  VkDeviceSize hitGroupOffset = helloVulkan.m_sbtGen.GetHitGroupOffset();
  VkDeviceSize hitGroupStride = helloVulkan.m_sbtGen.GetHitGroupEntrySize();

  vkCmdTraceRaysNV(cmdBuff, helloVulkan.m_shaderBindingTableBuffer, rayGenOffset,
                   helloVulkan.m_shaderBindingTableBuffer, missOffset, missStride,
                   helloVulkan.m_shaderBindingTableBuffer, hitGroupOffset, hitGroupStride,
                   VK_NULL_HANDLE, 0, 0, helloVulkan.m_framebufferSize.width,
                   helloVulkan.m_framebufferSize.height, 1);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We should now be able to alternate between rasterization and ray tracing. However, the ray tracing result only renders a flat gray image: the simplistic ray generation shader does not trace any ray yet, and simply returns a fixed color.

 Raster | | Ray-trace
:-----------------------------:|:---:|:--------------------------------:
![](/sites/default/files/pictures/2019/vulkan_raytracing/resultRasterCube.png width="350px") | <-> | ![](/sites/default/files/pictures/2019/vulkan_raytracing/resultRaytraceEmptyCube.png width="350px")

# Camera Setup

In the context of rasterization, the vertices of the objects are projected from their world-space position into the $[-1,1]\times[-1,1]\times[0,1]$ normalized device coordinate box, before being rasterized on the XY plane. For ray tracing, we need to initialize some rays at the camera position, and intersect the geometry in world space. To achieve this, we need to store the inverse view and projection matrices.
In the `UniformBufferObject` at the beginning of the `hello_vulkan.cpp` file, add the inverse matrices so that the structure becomes:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
struct UniformBufferObject
{
  glm::mat4 model;
  glm::mat4 view;
  glm::mat4 proj;
  glm::mat4 modelIT;
  // #VKRay
  glm::mat4 viewInverse;
  glm::mat4 projInverse;
};
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## updateUniformBuffer

The matrix inverses are computed in `updateUniformBuffer`, after setting the `ubo.proj` matrix:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // #VKRay
  ubo.viewInverse = glm::inverse(ubo.view);
  ubo.projInverse = glm::inverse(ubo.proj);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## raygen.rgen

It is now time to enrich the ray generation shader so that it can trace rays. We first add a new binding giving the shader access to the camera matrices:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
layout(binding = 2, set = 0) uniform CameraProperties
{
  mat4 model;
  mat4 view;
  mat4 proj;
  mat4 modelIT;
  mat4 viewInverse;
  mat4 projInverse;
}
cam;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When tracing a ray, the hit or miss shaders need to be able to return some information to the shader program that invoked the ray tracing. This is done through a payload, identified by the `rayPayloadNV` qualifier.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
layout(location = 0) rayPayloadNV vec3 hitValue;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The `main` function of the shader starts by computing the floating-point pixel coordinates, normalized between 0 and 1. `gl_LaunchIDNV` contains the integer coordinates of the pixel being rendered, while `gl_LaunchSizeNV` corresponds to the image size provided when calling `vkCmdTraceRaysNV`.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
void main()
{
  const vec2 pixelCenter = vec2(gl_LaunchIDNV.xy) + vec2(0.5);
  const vec2 inUV        = pixelCenter / vec2(gl_LaunchSizeNV.xy);
  vec2       d           = inUV * 2.0 - 1.0;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From the pixel coordinates, we apply the inverse view and projection matrices of the camera to obtain the origin and direction of the ray:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  vec4 origin    = cam.viewInverse * vec4(0, 0, 0, 1);
  vec4 target    = cam.projInverse * vec4(d.x, d.y, 1, 1);
  vec4 direction = cam.viewInverse * vec4(normalize(, 0);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In addition, we provide some flags for the ray: a flag indicating that, in this case, all geometry is to be considered opaque, and a mask that will be binary AND-ed with the masks of the geometry instances. Since all instances have a `0xFF` mask as well, they will all be visible. We also indicate the minimum and maximum distances of the potential intersections along the ray. This allows, for example, using short, inexpensive rays for ambient occlusion computations.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  uint  rayFlags = gl_RayFlagsOpaqueNV;
  uint  cullMask = 0xff;
  float tmin     = 0.001;
  float tmax     = 10000.0;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We now trace the ray itself, by first providing `traceNV` with the top-level acceleration structure and the ray masks. The next 3 parameters indicate which hit group is to be invoked when hitting a surface. For example, a single object may be associated with 2 hit groups, representing its behavior when hit by a direct camera ray or by a shadow ray. Since each instance has an index indicating the offset of its hit groups in the shader binding table, the `sbtRecordOffset` allows fetching the right kind of shader for that instance. For primary rays we may want to use the first hit group, hence an offset of 0, while shadow rays would require the second hit group, hence an offset of 1. The stride indicates the number of hit groups for a single instance. This is particularly useful if the instance offset is not set when creating the instances in the acceleration structure. A stride of 0 indicates that all hit groups are packed together, and that the instance offset can be used directly to find them in the SBT. The index of the miss shader comes next, followed by the ray origin, direction and extents. The last parameter identifies the payload that will be carried by the ray, by giving its location index: 0 corresponds to our payload definition above, `layout(location = 0) rayPayloadNV vec3 hitValue;`.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  traceNV(topLevelAS, rayFlags, cullMask, 0 /*sbtRecordOffset*/, 0 /*sbtRecordStride*/,
          0 /*missIndex*/,, tmin,, tmax, 0 /*payload*/);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Finally, we write the resulting payload into the output image:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  imageStore(image, ivec2(gl_LaunchIDNV.xy), vec4(hitValue, 0.0));
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Raster | | Ray-trace
:-----------------------------:|:---:|:--------------------------------:
![](/sites/default/files/pictures/2019/vulkan_raytracing/resultRasterCube.png width="350px") | <-> | ![](/sites/default/files/pictures/2019/vulkan_raytracing/resultRaytraceFlatCube.png width="350px")

# Simple Lighting

The current closest hit shader only returns a flat color. To add some lighting, we need to introduce surface normals. However, the ray tracing engine only provides the barycentric coordinates of the hit point. To obtain the normals and the other vertex attributes, we need to look them up in the vertex buffer and interpolate them using the barycentric coordinates. This is why we extended the usage of the vertex and index buffers when creating the ray tracing descriptor set.

## closesthit.rchit

When we created the ray tracing descriptor set, we already included the geometry definition. Therefore, we can reference the vertex and index buffers directly in the closest hit shader.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
layout(binding = 3, set = 0) buffer Vertices
{
  vec4 v[];
}
vertices;
layout(binding = 4, set = 0) buffer Indices
{
  uint i[];
}
indices;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The vertex buffer is defined as a simple array of `vec4` values. For improved readability, we replicate the vertex structure of the source file:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
struct Vertex
{
  vec3 pos;
  vec3 nrm;
  vec3 color;
  vec2 texCoord;
  int  matIndex;
};
// Number of vec4 values used to represent a vertex
uint vertexSize = 3;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We can then add a helper function to unpack the data of a given vertex into a `Vertex` structure:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
Vertex unpackVertex(uint index)
{
  Vertex v;

  vec4 d0 = vertices.v[vertexSize * index + 0];
  vec4 d1 = vertices.v[vertexSize * index + 1];
  vec4 d2 = vertices.v[vertexSize * index + 2];

  v.pos      =;
  v.nrm      = vec3(d0.w, d1.x, d1.y);
  v.color    = vec3(d1.z, d1.w, d2.x);
  v.texCoord = vec2(d2.y, d2.z);
  v.matIndex = floatBitsToInt(d2.w);
  return v;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the `main` function, `gl_PrimitiveID` allows us to find the vertices of the triangle hit by the ray:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
void main()
{
  ivec3 ind = ivec3(indices.i[3 * gl_PrimitiveID], indices.i[3 * gl_PrimitiveID + 1],
                    indices.i[3 * gl_PrimitiveID + 2]);

  Vertex v0 = unpackVertex(ind.x);
  Vertex v1 = unpackVertex(ind.y);
  Vertex v2 = unpackVertex(ind.z);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using the barycentric coordinates, we can interpolate the normal:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  const vec3 barycentrics = vec3(1.0 - attribs.x - attribs.y, attribs.x, attribs.y);

  vec3 normal = normalize(v0.nrm * barycentrics.x + v1.nrm * barycentrics.y
                          + v2.nrm * barycentrics.z);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The hardcoded directional light source can then be used to compute the dot product of the normal with the lighting direction, giving a simple diffuse lighting effect:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  vec3  lightVector = normalize(vec3(5, 4, 3));
  float dot_product = max(dot(lightVector, normal), 0.2);
  hitValue          = vec3(dot_product);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

![](/sites/default/files/pictures/2019/vulkan_raytracing/resultRaytraceLightGreyCube.png)

# Simple Materials

The rendering above can be made more interesting by adding support for materials. The imported OBJ objects provide the simplistic Wavefront material definition.

## closesthit.rchit

The materials define the basic reflectance properties using simple color coefficients, and also support texturing. The buffer containing the materials has already been created for rasterization, and has also been added to the ray tracing descriptor set.
Add the bindings of the material buffer and of the array of texture samplers:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
layout(binding = 5, set = 0) buffer MatColorBufferObject
{
  vec4[] m;
}
materials;
layout(binding = 6, set = 0) uniform sampler2D[] textureSamplers;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As for the geometry data, the material data is packed into an array of `vec4` values. We declare the material structure and an unpacking helper:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
struct WaveFrontMaterial
{
  vec3  ambient;
  vec3  diffuse;
  vec3  specular;
  vec3  transmittance;
  vec3  emission;
  float shininess;
  float ior;       // index of refraction
  float dissolve;  // 1 == opaque; 0 == fully transparent
  int   illum;     // illumination model (see
  int   textureId;
};
// Number of vec4 values used to represent a material
const int sizeofMat = 5;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
WaveFrontMaterial unpackMaterial(int matIndex)
{
  WaveFrontMaterial m;

  vec4 d0 = materials.m[sizeofMat * matIndex + 0];
  vec4 d1 = materials.m[sizeofMat * matIndex + 1];
  vec4 d2 = materials.m[sizeofMat * matIndex + 2];
  vec4 d3 = materials.m[sizeofMat * matIndex + 3];
  vec4 d4 = materials.m[sizeofMat * matIndex + 4];

  m.ambient       = vec3(d0.x, d0.y, d0.z);
  m.diffuse       = vec3(d0.w, d1.x, d1.y);
  m.specular      = vec3(d1.z, d1.w, d2.x);
  m.transmittance = vec3(d2.y, d2.z, d2.w);
  m.emission      = vec3(d3.x, d3.y, d3.z);
  m.shininess     = d3.w;
  m.ior           = d4.x;
  m.dissolve      = d4.y;
  m.illum         = int(d4.z);
  m.textureId     = floatBitsToInt(d4.w);
  return m;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the `main` function, let's start by removing the line writing the output payload, `hitValue = vec3(dot_product);`. The `Vertex` structure contains a material index, which we will use to find the corresponding material in the buffer. At the end of the `main` function, add:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  WaveFrontMaterial mat = unpackMaterial(v1.matIndex);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From that material, we can obtain the diffuse reflectance and use it to compute the diffuse lighting. Here we also add support for textures to modulate the surface albedo:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  vec3 c = dot_product * mat.diffuse;
  if (mat.textureId >= 0)
  {
    vec2 texCoord = v0.texCoord * barycentrics.x + v1.texCoord * barycentrics.y
                    + v2.texCoord * barycentrics.z;
    c *= texture(textureSamplers[mat.textureId], texCoord).xyz;
  }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We can now write the payload:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  hitValue = c;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

![](/sites/default/files/pictures/2019/vulkan_raytracing/resultRaytraceLightMatCube.png)

## main

For a more interesting model, go to the `main.cpp` file, find the line with `helloVulkan.loadModel(` and replace it by

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  helloVulkan.loadModel("../media/scenes/Medieval_building.obj");
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since that model is larger, we can change the `CameraManip.setLookat` call to

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  CameraManip.setLookat(glm::vec3(4.0f, 4.0f, 4.0f), glm::vec3(0, 0, 0), glm::vec3(0, 1, 0));
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

![](/sites/default/files/pictures/2019/vulkan_raytracing/resultRaytraceLightMatMedieval.png)

# Shadows

The above allows us to ray trace the scene and apply some lighting, but it is still missing shadows. To this end, we need to add a new ray type, and shoot rays from the closest hit shader. This new ray type requires adding a miss shader and a hit group. In the header file, add the indices of the new shaders:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  uint32_t m_shadowMissIndex;
  uint32_t m_shadowHitGroupIndex;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## createRaytracingPipeline

Adding a new ray type means adding a new hit group per geometry, and a new miss shader.

!!! Note: Shaders ([Download](/rtx/raytracing/vkrt_helpers/files/
    Download the shaders and extract the content to the `shaders` folder.

The archive contains one file: `shadowMiss.rmiss`. Add this file to the `shaders` filter of the Visual Studio project. Make sure that its properties page says the `Item Type` is `GLSL Validator`. The shader file should compile, and the resulting SPIR-V file is stored in the `shaders` folder alongside the GLSL files.

In the body of `createRaytracingPipeline`, add the definition of the new miss shader right after the previous miss shader:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // The second miss shader is invoked when a shadow ray misses the geometry. It
  // simply indicates that no occlusion has been found
  VkShaderModule missShadowModule = VkCtx.createShaderModule(readFile("shaders/shadowMiss.spv"));
  m_shadowMissIndex               = pipelineGen.AddMissShaderStage(missShadowModule);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The hit group for shadow rays is then added after the existing hit group:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // The second hit group defines the shaders invoked when a shadow ray hits the
  // geometry. For simple shadows we do not need any shader in that group: we will rely on
  // initializing the payload and updating it only in the miss shader
  m_shadowHitGroupIndex = pipelineGen.StartHitGroup();
  pipelineGen.EndHitGroup();
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The pipeline should now allow shooting rays from the closest hit program, which requires increasing the recursion level to 2:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // The ray tracing process can shoot rays from the camera, and a shadow ray can be shot from the
  // hit points of the camera rays, hence a recursion level of 2. This number should be kept as low
  // as possible for performance reasons. Even recursive ray tracing should be flattened into a loop
  // in the ray generation to avoid deep recursion.
  pipelineGen.SetMaxRecursionDepth(2);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

At the end of the method, we destroy the shader module of the shadow miss shader:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  vkDestroyShaderModule(VkCtx.getDevice(), missShadowModule, nullptr);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## createShaderBindingTable

The Shader Binding Table must also be updated, by adding the new miss program after the existing one:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Add the miss shader for the shadow rays
  m_sbtGen.AddMissProgram(m_shadowMissIndex, {});
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Similarly, we add the new hit group after the one used by the primary rays:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Add the hit group defining the behavior upon hitting a surface with a shadow ray
  m_sbtGen.AddHitGroup(m_shadowHitGroupIndex, {});
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## createTopLevelAS

When creating the instances for the top-level acceleration structure, we passed the index of the instance `i` as the index of the corresponding hit group. However, with two ray types, each instance now has two hit groups: one representing the behavior when a camera ray hits the geometry, and one used when a shadow ray hits that same geometry.
Therefore, in the `AddInstance` call, for the instance `i` we now use the hit group index `2*i`:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
    // For each instance we set its instance index to its index i in the instance vector, and set
    // its hit group index to 2*i. The hit group index defines which entry of the shader binding
    // table will contain the hit group to be executed when hitting this instance. We set this
    // index to 2*i due to the use of 2 types of rays in the scene: the camera rays and the shadow
    // rays. For each instance, the SBT will then have 2 hit groups
    m_topLevelASGenerator.AddInstance(instances[i].first, instances[i].second,
                                      static_cast<uint32_t>(i), static_cast<uint32_t>(2 * i));
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## createRaytracingDescriptorSet

For each resource entry in the descriptor set, we indicated which shader stages are able to use it. Since shadow rays will be traced from the closest hit shader, we replace the acceleration structure binding by:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  // Top-level acceleration structure, usable by both the ray generation and the closest hit (to
  // shoot shadow rays)
  m_rtDSG.AddBinding(0, 1, VK_DESCRIPTOR_TYPE_ACCELERATION_STRUCTURE_NV,
                     VK_SHADER_STAGE_RAYGEN_BIT_NV | VK_SHADER_STAGE_CLOSEST_HIT_BIT_NV);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

## closesthit.rchit

The closest hit shader now needs access to the acceleration structure to be able to shoot rays:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
layout(binding = 0, set = 0) uniform accelerationStructureNV topLevelAS;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Those rays will also carry a payload,
which will need to be defined at a different location than the payload of the current ray. In this case, the payload is a simple boolean indicating whether an occluder has been found:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
layout(location = 2) rayPayloadNV bool isShadowed;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In the `main` function, instead of simply setting our payload with `hitValue = c;`, we initiate a new ray. Note that the index of the miss shader is now 1, since the SBT contains 2 miss shaders. We also set the record offset to 1, indicating that we want to use the second hit group defined for the instance the ray hits. The payload location is set to 2 to match the declaration `layout(location = 2)` above.

Note that since we did not define any shader in the shadow hit group, no code will be invoked when hitting a surface. Therefore, we initialize the payload `isShadowed` to `true`, and rely on the miss shader to set it to `false` if no surface has been encountered. We also use the ray flags to optimize the ray tracing: such simplistic shadow rays only need to report whether any intersecting surface exists. The flags then instruct the ray tracing engine to stop the traversal after the first intersection has been found, and not to execute a closest hit shader.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  float tmin   = 0.001;
  float tmax   = 100.0;
  vec3  origin = gl_WorldRayOriginNV + gl_WorldRayDirectionNV * gl_HitTNV;

  isShadowed = true;
  traceNV(topLevelAS,
          gl_RayFlagsTerminateOnFirstHitNV | gl_RayFlagsOpaqueNV | gl_RayFlagsSkipClosestHitShaderNV,
          0xFF, 1 /* sbtRecordOffset */, 0 /* sbtRecordStride */, 1 /* missIndex */, origin, tmin,
          lightVector, tmax, 2 /*payload location*/);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The final payload value can then be adjusted depending on the result of the shadow ray:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ C
  if (isShadowed)
    hitValue = c * 0.3;
  else
    hitValue = c;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

![](/sites/default/files/pictures/2019/vulkan_raytracing/resultRaytraceShadowMedieval.png)

The final project can be downloaded [here](/rtx/raytracing/vkrt_helpers/files/

# Going Further

From this point on, you can continue creating your own ray types and shaders, and experiment with more advanced ray-tracing-based algorithms. This tutorial was intended to help you grasp the concepts of Vulkan ray tracing without going into too many API details. We encourage you to go through the documentation of the [helpers](/rtx/raytracing/vkrt_helpers) to understand how each concept maps to the corresponding API calls.