Nsight Graphics 2023.3
NVIDIA® Nsight™ Graphics 2023.3 is released with the following changes:
Feature Enhancements:
- GPU Trace
- Introducing the new Real-Time Shader Profiler in GPU Trace!
- GPU Trace is now able to collect shader performance stats from your live application.
- This feature is available on NVIDIA Ampere and Ada Architectures, for D3D12 and Vulkan applications.
- The GPU Contexts timeline row indicates which graphics/compute context is executing on the GPU at each moment in time.
- Pipeline State Object binding events are now traced, and shown in the Queue | Actions timeline row. This feature is available on D3D12 and Vulkan.
- Compute dispatch events are now traced, and shown in the Queue | Actions timeline row.
- Frame Debugger
- Added support for VK_EXT_shader_object. For more details, see the Vulkan documentation for this extension.
Improvements:
- GPU Trace
- On Windows, the GPU Trace HUD overlay now shows whether allocations intended for VRAM were demoted to SysMem, via the "VRAM Bytes Demoted" indicator. For more information, see the Residency section of Memory Management in Direct3D 12.
- Capture Information now contains System Memory usage and VRAM usage. VRAM usage is decomposed into requested, committed, and demoted bytes. VRAM usage is only available on Windows.
- The GPU Trace HUD overlay is now shown in applications that present on a compute queue.
- A new Capture Screenshot option controls whether the final render target will be saved into the GPU Trace file.
- The "Advanced Mode" metrics have been renamed to "Multi-Pass Metrics", and are usable alongside any set of "Timeline Metrics". Timeline Metrics and Multi-Pass Metrics can be found in the process launch settings.
- Vulkan draw commands are now traced from the following extensions: VK_KHR_draw_indirect_count, VK_NV_mesh_shader, VK_EXT_mesh_shader.
- Vulkan vkCmdTraceRaysIndirect2KHR commands are now traced.
- Improved compatibility with applications that use the DirectX 12 Agility SDK.
- Improved compatibility with applications that use the NVIDIA Streamline SDK.
- The SM Throughput metric now includes SM Tensor Pipe Active.
- FP32 and FP16 performance is now presented accurately in the SM Instruction Throughputs timeline row, by removing the “FMA Light Pipe” and adding the “FMA Pipe”.
- In OpenGL/Vulkan interop applications, shader performance stats are available for Vulkan shaders, via the Real-Time Shader Profiler.
- Shader Profiler
- The Shader Profiler can now be used in GPU Trace, which is referred to as the Real-Time Shader Profiler. This is in addition to the Shader Profiler in the Frame Debugger, which requires a frame capture to view shader performance stats.
- The Shader Profiler for Vulkan can now collect instruction execution counters for 3D and Compute shaders in the Frame Debugger.
- The Shader Profiler views now have enhanced keyboard navigation. Press Enter to “go to source line” in the source view. In views that have a Call Location column, press Space to “go to call location” in the source view.
- The Top-Down Calls and Bottom-Up Calls views will display source level statistics when the Aggregate Statistics option is checked. When it is unchecked, per-IL statistics are shown, which may show the same source files and lines multiple times.
- The Top-Down Calls and Bottom-Up Calls views now have filter boxes, allowing you to search for shader functions by name or hash.
- The Shader Pipelines view now has a filter box, allowing you to search for shader pipelines by name or hash.
- The Shader Pipelines view can now display shaders that did not execute. This feature is activated by checking the “Inactive Shaders” checkbox.
- The Shader Profiler now displays the same 64-bit shader hash values as the NVIDIA Nsight Aftermath SDK.
- Source and IL correlation has been improved. We encourage using the latest R535 driver for the best experience.
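To illustrate the difference between per-IL and aggregated statistics in the Top-Down Calls and Bottom-Up Calls views, here is a minimal sketch of the idea. It is not how Nsight Graphics implements the Aggregate Statistics option; the `ILSample` record, its field names, and the sample values are all hypothetical and exist only for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class ILSample(NamedTuple):
    """One hypothetical per-IL profiler row (field names are illustrative)."""
    source_file: str
    line: int
    il_offset: int  # position of the IL instruction within the shader
    samples: int    # samples attributed to this IL instruction


def aggregate_by_source_line(rows):
    """Fold per-IL rows into a single total per (file, line) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row.source_file, row.line)] += row.samples
    return dict(totals)


# Per-IL view: the same source line (lighting.hlsl:42) appears twice,
# because two IL instructions map back to it.
per_il = [
    ILSample("lighting.hlsl", 42, 0x10, 120),
    ILSample("lighting.hlsl", 42, 0x18, 30),
    ILSample("lighting.hlsl", 57, 0x20, 75),
]

# Aggregated view: one row per source line.
aggregated = aggregate_by_source_line(per_il)
for (path, line), samples in sorted(aggregated.items()):
    print(f"{path}:{line} {samples}")
```

In the per-IL list, line 42 shows up once per IL instruction; after aggregation it appears once with the summed count, which mirrors why the unchecked option may show the same source files and lines multiple times.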
Known Issues:
- GPU Trace
- GPU Trace on Windows: CommandList timeline events may appear to be active for longer than their true duration, when in reality the underlying hardware queue was in a wait state for the initial portion of that time. This only occurs when Windows Hardware Accelerated GPU Scheduling is enabled.
- In the GPU Contexts row, CUDA Contexts may be mislabeled as another API.
- CUDA contexts used for CUDA-D3D12 interop may be labeled as D3D12 Contexts.
- CUDA contexts used for CUDA-OpenGL interop may be labeled as OpenGL Contexts.
- CUDA contexts used for CUDA-Vulkan interop may be labeled as Vulkan Contexts.
- The GPU Contexts row may contain incomplete data in very long traces, or in scenarios with high frequency context switching. If this is observed due to background processes, we recommend disabling those before profiling. If this is observed due to the application’s own behavior, we recommend using fewer contexts to gain better performance, or using Nsight Systems as an alternative.
- Shader Profiler
- The Shader Profiler does not support VkShaderEXT objects.
- The Vulkan Shader Profiler’s support for KHR_non_semantic_info is contingent on shader compiler support. dxc -fspv-debug=vulkan-with-source works well, aside from a known compiler bug. In the Vulkan SDK, glslangValidator -gVS does not produce sufficient info at the time of this release.
- In the Frame Debugger, when Collect SASS Execution Counters is true, the Shader Profiler may encounter a Device Lost error on R535 drivers. To work around this, enable Nsight Aftermath before the Frame Debugger session.
- In the Frame Debugger, instruction execution counters are no longer collected by default. To re-enable these counters, we recommend first increasing the Windows TDR Timeout to 30 seconds; then in the Frame Debugger launch settings | Troubleshooting tab, set Collect SASS Execution Counters to “Yes”.
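As a sketch of the TDR step above: on Windows, the TDR timeout is commonly controlled by the `TdrDelay` DWORD value (in seconds) under the `GraphicsDrivers` registry key. The example below assumes that this is the setting the note refers to, and only generates the text of a `.reg` file rather than touching the registry. The helper name `tdr_delay_reg_file` is hypothetical.

```python
# Registry key that holds the Windows TDR settings.
# Assumption: the "TDR Timeout" mentioned above maps to the TdrDelay value.
TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_delay_reg_file(seconds: int) -> str:
    """Return .reg file text that sets TdrDelay to the given number of seconds."""
    if not 0 < seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must be a positive value that fits in a DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{TDR_KEY}]",
        f'"TdrDelay"=dword:{seconds:08x}',  # DWORD values are written in hex
        "",
    ]
    return "\r\n".join(lines)


reg_text = tdr_delay_reg_file(30)
print(reg_text)
```

Review the generated text before using it. Importing it requires administrator rights, and a restart is typically needed before a changed TDR setting takes effect.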
For more details and known issues, please see the full release notes!
For an overview of Nsight™ Graphics and access to resources, please visit the main Nsight™ Graphics page.
NVIDIA® Nsight™ Graphics 2023.3 is available for download under the NVIDIA Registered Developer Program.