Driver and New Sample for VK_NVX_device_generated_commands

Recently we introduced the VK_NVX_device_generated_commands (DGC) Vulkan extension, which allows rendering commands to be generated entirely on the GPU. Earlier this week, we added support for VK_NVX_device_generated_commands to our Windows and Linux release drivers. Today we are releasing the ‘BasicDeviceGeneratedCommandsVk’ GameWorks SDK sample. We highly recommend reading the introductory Vulkan Device-Generated Commands article in addition to this blog post.

BasicDeviceGeneratedCommandsVk GameWorks SDK Sample

The sample renders a model split into two parts, where each part is a subset of the geometry selected via indexCount and firstIndex. Each part is rendered using a pipeline state object (PSO) with a different polygon mode. Several methods are implemented to render those parts, each generating a different amount of work on the GPU, as the following table summarizes:

| Draw mode | Commands generated via API calls | Commands generated on device |
|---|---|---|
| Core Vulkan | | |
| DrawIndexed | VBO/IBO bindings, PSO bindings, descriptor set bindings, draw calls | (none) |
| DrawIndirect | VBO/IBO bindings, PSO bindings, descriptor set bindings | Draw calls |
| VK_NVX_device_generated_commands | | |
| DeviceGeneratedDrawIndirect | VBO/IBO bindings, PSO bindings, descriptor set bindings | Draw calls |
| DeviceGeneratedPsoDrawIndirect | VBO/IBO bindings, descriptor set bindings | PSO bindings, draw calls |
| DeviceGeneratedVboIboPsoDrawIndirect | Descriptor set bindings | VBO/IBO bindings, PSO bindings, draw calls |
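As a rough illustration of how the two parts can be selected via indexCount and firstIndex, the sketch below fills two indirect draw records for one index buffer. The struct is a local mirror of the field layout of core Vulkan's VkDrawIndexedIndirectCommand so that the snippet compiles without the Vulkan headers; the helper name splitIntoTwoParts and the split point are assumptions made for this example and are not taken from the sample.

```cpp
#include <cstdint>
#include <vector>

// Local mirror of the field layout of core Vulkan's VkDrawIndexedIndirectCommand,
// redeclared here so the sketch compiles without the Vulkan headers.
struct DrawIndexedIndirectCommand {
    uint32_t indexCount;     // number of indices drawn for this part
    uint32_t instanceCount;
    uint32_t firstIndex;     // offset of this part inside the index buffer
    int32_t  vertexOffset;
    uint32_t firstInstance;
};

// Hypothetical helper: split one index range into two consecutive parts.
// splitIndex is clamped so that both parts stay inside the index buffer.
std::vector<DrawIndexedIndirectCommand> splitIntoTwoParts(uint32_t totalIndexCount,
                                                          uint32_t splitIndex) {
    if (splitIndex > totalIndexCount) splitIndex = totalIndexCount;
    DrawIndexedIndirectCommand first{splitIndex, 1u, 0u, 0, 0u};
    DrawIndexedIndirectCommand second{totalIndexCount - splitIndex, 1u, splitIndex, 0, 0u};
    return {first, second};
}
```

In the indirect modes, records like these would live in a device buffer rather than in host memory; how the sample actually fills that buffer is not shown here.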



Notes

The DeviceGeneratedDrawIndirect and DrawIndirect modes are functionally equivalent and are intended to show the device-generated commands API calls that correspond to core Vulkan indirect draw API calls. The other DeviceGenerated* modes build on this to illustrate more interesting use cases of device-generated commands.

DeviceGeneratedPsoDrawIndirect changes the PSO from within the token buffer and thus allows both parts of the model to be rendered with a single draw call, which is impossible with (multi) draw indirect.
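To make the idea concrete, here is a simplified model of what the token data for the two parts might contain when the PSO is selected per sequence. The PartSequence struct and the buildPartSequences helper are hypothetical and exist only for this sketch; they are not the extension's real token structures, and treating the PSO reference as a small index is an assumption.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-sequence data of a token buffer whose
// layout holds a pipeline token followed by an indexed-draw token.
struct PartSequence {
    uint32_t pipelineIndex;  // assumed: index of a PSO registered beforehand
    uint32_t indexCount;     // indices drawn for this part
    uint32_t firstIndex;     // start of this part in the index buffer
};

// Build one sequence per model part, each selecting its own PSO.
std::vector<PartSequence> buildPartSequences(uint32_t totalIndexCount,
                                             uint32_t splitIndex) {
    if (splitIndex > totalIndexCount) splitIndex = totalIndexCount;
    return {
        PartSequence{0u, splitIndex, 0u},
        PartSequence{1u, totalIndexCount - splitIndex, splitIndex},
    };
}
```

Because each sequence carries its own pipeline selection, both sequences can be consumed together, whereas a plain indirect draw would need a separate PSO bind issued through the API between the two parts.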

There are new bits for pipeline barrier stages and access that should be used to synchronize access to the buffers used for command generation:

To sync from writing the indirect commands to vkCmdProcessCommandsNVX:
    srcStageMask/srcAccessMask = whatever wrote the buffer
    dstStageMask = VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX
    dstAccessMask = VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX

To sync from vkCmdProcessCommandsNVX to vkCmdExecuteCommands:
    srcStageMask = VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX
    srcAccessMask = VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX
    dstStageMask = VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
    dstAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT
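The two synchronization points can also be written down as data, which makes it easier to see how they chain. The sketch below is a symbolic model only: the Stage and Access enums, the Barrier struct, and the vulkanName helpers are hypothetical stand-ins invented for this example, and the strings are the identifiers quoted above. Real code would pass the corresponding Vulkan flag bits to vkCmdPipelineBarrier.

```cpp
#include <string>

// Symbolic stand-ins for the stage and access flags (illustration only).
enum class Stage { BufferWriter, CommandProcess, DrawIndirect };
enum class Access { BufferWriterWrite, CommandProcessRead,
                    CommandProcessWrite, IndirectCommandRead };

struct Barrier {
    Stage  srcStage;
    Access srcAccess;
    Stage  dstStage;
    Access dstAccess;
};

// Indirect commands written -> vkCmdProcessCommandsNVX reads them.
constexpr Barrier kWriteToProcess{Stage::BufferWriter, Access::BufferWriterWrite,
                                  Stage::CommandProcess, Access::CommandProcessRead};

// vkCmdProcessCommandsNVX wrote its output -> vkCmdExecuteCommands consumes it.
constexpr Barrier kProcessToExecute{Stage::CommandProcess, Access::CommandProcessWrite,
                                    Stage::DrawIndirect, Access::IndirectCommandRead};

// Map the symbolic stage to the Vulkan identifier named in the text.
inline std::string vulkanName(Stage s) {
    switch (s) {
        case Stage::BufferWriter:   return "(stage that wrote the buffer)";
        case Stage::CommandProcess: return "VK_PIPELINE_STAGE_COMMAND_PROCESS_BIT_NVX";
        case Stage::DrawIndirect:   return "VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT";
    }
    return "";
}

// Map the symbolic access to the Vulkan identifier named in the text.
inline std::string vulkanName(Access a) {
    switch (a) {
        case Access::BufferWriterWrite:   return "(access that wrote the buffer)";
        case Access::CommandProcessRead:  return "VK_ACCESS_COMMAND_PROCESS_READ_BIT_NVX";
        case Access::CommandProcessWrite: return "VK_ACCESS_COMMAND_PROCESS_WRITE_BIT_NVX";
        case Access::IndirectCommandRead: return "VK_ACCESS_INDIRECT_COMMAND_READ_BIT";
    }
    return "";
}
```

Note that the destination stage of the first barrier is the source stage of the second, while the access changes from read (consuming the indirect commands) to write (producing the commands that are executed).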

Compute support will be added in a future driver release.

vkGetPhysicalDeviceGeneratedCommandsPropertiesNVX is known to crash with an unextended loader, due to the way physical device arguments to functions are handled. Here is what’s currently implemented:

| Feature/Limit | Value |
|---|---|
| computeBindingSupport | false |
| maxIndirectCommandsLayoutTokenCount | 32 |
| maxObjectEntryCounts | 2^31 |
| minSequenceCountBufferOffsetAlignment | 256 |
| minSequenceIndexBufferOffsetAlignment | 32 |
| minCommandsTokenBufferOffsetAlignment | 32 |
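Since the query itself may crash, one option is to hard-code the alignment limits listed above when placing data in buffers. The following sketch rounds a byte offset up to a required alignment; the DgcLimits struct and the alignUp helper are names assumed for this example, and the values are the ones reported above for the current driver, so they may differ in other releases.

```cpp
#include <cstdint>

// Alignment limits as listed above for the current driver (may change later).
struct DgcLimits {
    uint32_t minSequenceCountBufferOffsetAlignment = 256;
    uint32_t minSequenceIndexBufferOffsetAlignment = 32;
    uint32_t minCommandsTokenBufferOffsetAlignment = 32;
};

// Round offset up to the next multiple of alignment.
// alignment must be non-zero; it does not have to be a power of two.
constexpr uint64_t alignUp(uint64_t offset, uint64_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}
```

For example, a sequence count placed after 100 bytes of other data would be moved to alignUp(100, limits.minSequenceCountBufferOffsetAlignment), which is offset 256.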

A few things are not implemented in this sample for simplicity; they are, however, straightforward to add:
    - Generating the token buffers from a shader instead of uploading them to the device via vkCmdUpdateBuffer
    - Binding descriptor sets from the token buffer instead of binding them via API calls



References

Drivers:
    - Windows, version 376.09 or newer
    - Linux, version 375.20 or newer

Headers: A future LunarG SDK release is expected to include headers for the extension. In the meantime, definitions and declarations are provided as part of the sample in vk_nvx_device_generated_commands.h/.cpp.



Sample Code

Specifications and Documentation