
NVIDIA Nsight Systems Adds Vulkan Support

Vulkan is a low-overhead, cross-platform 3D graphics and compute API that targets a wide variety of devices, from cloud gaming servers to PCs and embedded platforms. The Khronos Group manages and defines the Vulkan API.

Introduction to NVIDIA Nsight Systems

NVIDIA Nsight™ Systems provides developers with a unified timeline view that displays how applications use computer resources. This low-overhead performance analysis tool helps developers analyze and optimize their application’s algorithms as they run across the CPU and GPU.

Nsight Systems helps identify performance issues such as GPU starvation, unnecessary CPU and GPU synchronizations, insufficient CPU parallelization or pipelining, and unexpectedly expensive CPU or GPU algorithms. Nsight Systems employs low-overhead tracing and sampling techniques to collect process and thread activity. It correlates profiling data across CPU cores and GPU queues. This correlation enables developers to investigate bottlenecks from the “scene of the crime” back to the origins of the performance issue.

Nsight Systems Vulkan Support

Nsight Systems release 2019.3 adds Vulkan profiling support on Microsoft Windows (x64) and Linux operating systems. The tool captures information about how the profiled process uses Vulkan, including Vulkan API function execution times, the corresponding GPU workloads, debug utils labels, and frame durations. Figure 1 shows a typical timeline.

Figure 1. Nsight Systems timeline view

Nsight Systems Timeline

Nsight Systems displays data in a unified timeline view that combines CPU, GPU, and other activities into a single timeline. The timeline comprises multiple rows, each displaying different data. Let’s dive into the timeline’s rows to see how they can help you profile your Vulkan-based application.


CPU rows display how your process’s threads utilize each CPU core. This helps locate periods when CPU cores sit idle, as figure 2 shows.

Figure 2. CPU Cores Utilization


Thread rows display a detailed view of each thread’s activities, shown in figure 3. These include CPU core load, blocked states, call stacks, OS runtime library usage, Vulkan API calls, NVTX time ranges, and Vulkan debug utils labels.

Figure 3. An individual thread timeline row

Figure 3 illustrates threading- and synchronization-related issues. When a thread is blocked, contextual cues and tooltips present the blocking reason.

Displaying Vulkan debug utils labels or NVIDIA NVTX time ranges and markers on the Nsight Systems timeline lets developers annotate the timeline with the application logic, as figure 4 shows. This correlation with the application logic greatly improves the timeline’s readability.

Figure 4. Vulkan debug labels

Command Buffer Creation

The Command Buffer Creation row shown in figure 5 displays the time periods during which command buffers are created. This enables developers to improve their application’s multithreaded command buffer creation. The command buffer creation period is measured from the call to vkBeginCommandBuffer to the call to vkEndCommandBuffer.

Figure 5. Command buffer creation row

Swap Chains

The Swap Chain row seen in figure 6 displays the available swap chains and the time periods where vkQueuePresentKHR was executed on each swap chain.

Figure 6. Swap chain row


A Queue row is displayed for each Vulkan queue created by the profiled application, as figure 7 shows. The API sub-row displays time periods where vkQueueSubmit was called. The GPU Workload sub-row displays time periods where GPU workloads executed.

Figure 7. Vulkan queue timeline rows

In addition, you can see VK_EXT_debug_utils labels on both the CPU and the GPU.

Figure 8. Vulkan debug labels

Clicking on a GPU workload highlights the corresponding vkQueueSubmit call, and vice versa.

Figure 9. GPU workloads

Stutter Analysis

Nsight Systems provides stutter analysis visualization aids that help trace hard-to-find causes of frame-rate stutter.

Frame Duration

The Frame Duration section, highlighted in figure 10, displays frame durations on both the CPU and the GPU. The colors alternate tones to improve visibility.

Figure 10. Frame duration

Stutter row

The stutter row highlights frames that are significantly longer than the other frames in their immediate vicinity. Nsight Systems does this by comparing the duration of each frame to the median duration of the surrounding 19 frames. This feature lets developers easily identify local frame-time stuttering. Red indicates local frame stutter greater than 50%, as shown in figure 11.

Figure 11. Frame stutter row
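
The median comparison behind the stutter row can be sketched in a few lines of Python. This is only an illustration of the idea, not Nsight Systems’ actual implementation. It assumes the 19-frame window is centered on the frame being tested, includes that frame, and is clipped at the start and end of the capture. The function name, parameters, and sample data are all hypothetical.

```python
from statistics import median


def find_stutter_frames(durations_ms, window=19, threshold=0.5):
    """Return indices of frames much longer than their local median.

    Assumption: the window is centered on the frame, includes the frame
    itself, and is clipped at the start and end of the capture.
    """
    half = window // 2
    stutters = []
    for i, duration in enumerate(durations_ms):
        lo = max(0, i - half)
        hi = min(len(durations_ms), i + half + 1)
        local_median = median(durations_ms[lo:hi])
        # Flag the frame when it exceeds the local median by more than
        # `threshold` (0.5 corresponds to the 50% mentioned above).
        if duration > local_median * (1.0 + threshold):
            stutters.append(i)
    return stutters


# Hypothetical capture: a steady ~60 FPS run with one long frame.
frame_times = [16.6] * 30
frame_times[12] = 40.0
print(find_stutter_frames(frame_times))  # prints [12]
```

Using the median rather than the mean keeps a single long frame from raising its own baseline, so an isolated spike still stands out against its neighbors.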

Frame Health row

The Frame Health row displays actions that took significantly longer during the current frame than the median time of the same actions during the surrounding 19 frames. This helps users detect the cause of frame-time stuttering. Such actions include shader compilation, present, memory mapping, and more. Nsight Systems measures the accumulated time of such actions in each frame. For example, it calculates the accumulated time of shader compilations in each frame and compares it to the accumulated time of shader compilations in the surrounding 19 frames.
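
As a rough sketch of this per-action comparison, the Python below flags actions whose accumulated time in a frame stands out against the median of the same action in nearby frames. It is an illustration only, not the tool’s implementation. The 50% threshold, the centered and clipped window, the action names, and the data layout (one dictionary of accumulated milliseconds per frame) are all assumptions made for the example.

```python
from statistics import median


def find_slow_actions(frames, window=19, threshold=0.5):
    """Flag actions that took unusually long in a frame.

    `frames` is a list of dicts mapping an action name to its accumulated
    time in milliseconds for that frame (a hypothetical layout).
    Returns (frame_index, action, accumulated_ms, local_median_ms) tuples.
    """
    half = window // 2
    findings = []
    for i, frame in enumerate(frames):
        lo = max(0, i - half)
        hi = min(len(frames), i + half + 1)
        for action, total_ms in frame.items():
            # Accumulated time of the same action in the surrounding frames;
            # frames where the action never ran count as 0 ms.
            local = [f.get(action, 0.0) for f in frames[lo:hi]]
            local_median = median(local)
            # Assumed threshold: more than 50% above the local median.
            if total_ms > local_median * (1.0 + threshold):
                findings.append((i, action, total_ms, local_median))
    return findings


# Hypothetical capture: shader compilation happens only in frame 10.
frames = [{"present": 1.0, "shader_compilation": 0.0} for _ in range(25)]
frames[10]["shader_compilation"] = 12.0
print(find_slow_actions(frames))
```

In this made-up capture, only the shader compilation in frame 10 is reported, because the present time stays constant from frame to frame.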

GPU Memory Utilization

The Memory Utilization row shown in figure 12 displays the amount of local GPU memory the application uses on Windows, along with the commit limit for each GPU. This helps when investigating cases where the application uses more GPU memory than its limit, causing it to fall off the “GPU memory cliff” and page out to virtual memory.

Figure 12. Local GPU memory usage helps spot overuse of memory, which can slow the application when GPU memory pages out to main memory

Getting Started

Nsight Systems is free to download and use. Vulkan support requires 64-bit Windows or Linux running on NVIDIA GPUs. You’ll need NVIDIA driver version 418.56 or higher.
