NVIDIA Nsight Systems

NVIDIA Nsight™ Systems is a system-wide performance analysis tool designed to visualize an application’s algorithms, help you identify the largest opportunities to optimize, and tune to scale efficiently across any quantity or size of CPUs and GPUs, from large servers to our smallest system on a chip (SoC).

Get started
Nsight Systems 2023.1 is available now.

Learn how Nsight Systems can be used to accelerate development and help make high-performance games with beautiful graphics.

Profile the system.

While the GPU is the engine that drives an application’s graphics, the CPU should operate asynchronously to execute instructions and send tasks to the graphics unit. The same is true for NVIDIA network interface cards (NICs), data processing units (DPUs), and SoCs. The picture of app optimization is not complete without drilling deeply into these interactions to ensure maximum parallelism is achieved. Nsight Systems visualizes unbiased, system-wide activity data on a unified timeline, allowing application developers to investigate correlations, dependencies, activity, bottlenecks, and resource allocation to ensure hardware components are working harmoniously.

Analyze performance.

Nsight Systems ensures every facet that contributes to an application’s performance can be accessed and analyzed by developers. It offers low-overhead performance analysis that visualizes otherwise hidden layers of events and metrics used for pursuing optimizations, including CPU parallelization and core utilization, GPU streaming-multiprocessor (SM) optimization, system workload and CUDA® libraries trace, network communications, OS interactions, and more.

Scale across platforms.

Nsight Systems is designed to scale across a wide range of NVIDIA platforms, from NVIDIA DGX™ multi-GPU+multi-NIC x86 servers to NVIDIA RTX™ workstations, NVIDIA® GeForce® gaming PCs, NVIDIA Optimus™-enabled laptops, NVIDIA DRIVE® multi-OS devices with Tegra® plus a dedicated graphics card (dGPU), and NVIDIA Jetson™. It provides valuable insights for real-time stutter analysis in video games and pro-visualization applications, and for load balancing in high-performance computing (HPC). These insights extend to deep learning and inference, where load and distribution can be seen in frameworks such as PyTorch and TensorFlow, allowing users to tune their models and parameters to increase single- or multi-GPU utilization.

Explore the key features.

Visualize CPU-GPU interactions.

Nsight Systems latches on to a target application to expose GPU and CPU activity, events, annotations, throughput, and performance metrics in a chronological timeline. With low overhead, this data can be visualized accurately and in parallel for ease of understanding. GPU workloads are further correlated with in-application CPU events, allowing for performance blockers to be easily identified and remedied.
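The annotations mentioned above are typically emitted by the application itself through NVTX ranges. The following is a minimal sketch, assuming the optional `nvtx` Python package is installed; the function names (`load_batch`, `process_batch`, `run`) are hypothetical and exist only for illustration. If the package is missing, the ranges fall back to no-ops so the code still runs.

```python
import contextlib

try:
    # NVIDIA's NVTX Python bindings (assumption: installed via `pip install nvtx`).
    import nvtx
    annotate = nvtx.annotate
except ImportError:
    # Fallback so the sketch still runs without NVTX; ranges become no-ops.
    def annotate(message=None, **kwargs):
        return contextlib.nullcontext()


def load_batch(n):
    # Hypothetical CPU-side stage, wrapped in a named range.
    with annotate("load_batch"):
        return list(range(n))


def process_batch(batch):
    # Hypothetical compute stage; in a real application this might launch GPU work.
    with annotate("process_batch"):
        return sum(x * x for x in batch)


def run(steps=3, batch_size=1000):
    totals = []
    for step in range(steps):
        # Nested ranges: one outer range per step, inner ranges per stage.
        with annotate(f"step {step}"):
            totals.append(process_batch(load_batch(batch_size)))
    return totals


if __name__ == "__main__":
    print(run())
```

When an application instrumented this way is profiled with NVTX tracing enabled, the named ranges should appear on the timeline next to the CPU and GPU rows, which makes it easier to match a GPU workload to the code region that issued it.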

CPU activity (top) in parallel to GPU graphics and compute activity (bottom).
The GPU Metrics section of the Nsight Systems timeline.

Track GPU activity.

To further explore the GPU, toggling on GPU Metrics Sampling will plot low-level input/output (IO) activity such as PCIe throughput, NVIDIA NVLink®, and dynamic random-access memory (DRAM) activity. GPU Metrics Sampling also exposes SM utilization, Tensor Core activity, instruction throughput, and warp occupancy. Each workload and its CPU origin can be readily tracked to support performance tuning.

Trace GPU workloads.

For compute tasks, Nsight Systems supports investigating the CUDA API and tracing CUDA libraries, including cuBLAS, cuDNN, and NVIDIA TensorRT™. For graphics computing, Nsight Systems supports profiling Vulkan, OpenGL, DirectX 11, DirectX 12, DXR, and NVIDIA OptiX™ APIs.

DX12 API calls as they happen chronologically in the timeline alongside render thread.
Nsight Systems detected a low-health frame causing a large stutter, as well as the calls that caused it.

Detect frame stutter and bottlenecks.

Nsight Systems automatically detects slow frames (by highlighting frame times higher than a target) as well as local stutter frames (by highlighting frames with higher times than neighboring frames). It also automatically reports CPU times per frame and API calls that are likely candidates for causing stutters. This equips developers with plenty of information to locate and resolve the causes of frame drops and inconsistent frame timing.
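The two kinds of detection described above can be illustrated with a small sketch. This is not Nsight Systems' actual algorithm: the neighbor window size, the 1.5x ratio, the use of a median, and the sample frame times are all assumptions chosen for illustration.

```python
from statistics import median


def find_slow_frames(frame_times_ms, target_ms):
    """Indices of frames whose duration exceeds the target frame time."""
    return [i for i, t in enumerate(frame_times_ms) if t > target_ms]


def find_local_stutters(frame_times_ms, window=2, ratio=1.5):
    """Indices of frames noticeably longer than their neighbors.

    A frame counts as a local stutter when its time exceeds `ratio` times
    the median of up to `window` frames on each side (assumed heuristic).
    """
    stutters = []
    for i, t in enumerate(frame_times_ms):
        before = frame_times_ms[max(0, i - window):i]
        after = frame_times_ms[i + 1:i + 1 + window]
        neighbors = before + after
        if neighbors and t > ratio * median(neighbors):
            stutters.append(i)
    return stutters


# Hypothetical capture: one spike, then a sustained run of slow frames.
frame_times_ms = [16.6, 16.7, 41.0, 16.8, 25.0, 25.2, 25.1, 24.9]

slow = find_slow_frames(frame_times_ms, target_ms=20.0)
local = find_local_stutters(frame_times_ms)
print("slow frames:", slow)
print("local stutters:", local)
```

In this sample the sustained 25 ms frames miss the 20 ms target but are consistent with their neighbors, so only the isolated 41 ms spike is flagged as a local stutter. That is why the two checks are useful side by side.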


Read more about using Nsight Systems to fix stutters in games

Experience a tool built for flexibility.

Nsight Systems has a GUI, but it can also be used through the command line. For those who prefer automation, features like the Expert System provide automatic analysis to uncover performance blockers and recommend fixes. Nsight Systems can be leveraged by a solo hobbyist or a full team of engineers, from the scale of servers down to SoCs. Nsight Systems is built for every developer.

Nsight Systems’ command-line interface (CLI) being used for application profiling.
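For automation, the CLI can be driven from a script. The sketch below builds an `nsys profile` command line and only launches it when `nsys` is found on the PATH. The helper names, the application path `./my_app`, its arguments, and the choice of trace sources are assumptions for illustration; check `nsys profile --help` for the options supported by your installed version.

```python
import shutil
import subprocess


def build_nsys_command(app_argv, output="report", traces=("cuda", "nvtx", "osrt")):
    """Build an argv list for `nsys profile` (hypothetical helper)."""
    return [
        "nsys", "profile",
        f"--trace={','.join(traces)}",   # which APIs to trace
        f"--output={output}",            # base name of the report file
        "--force-overwrite=true",        # replace an existing report of the same name
        *app_argv,                       # the target application and its arguments
    ]


def profile(app_argv, **kwargs):
    """Run the profiler if available; otherwise print the command and return None."""
    cmd = build_nsys_command(app_argv, **kwargs)
    if shutil.which("nsys") is None:
        print("nsys not found on PATH; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)


# Show the command for a hypothetical application without launching anything.
print(" ".join(build_nsys_command(["./my_app", "--frames", "100"])))
```

Keeping command construction separate from execution makes it easy to log or review the exact invocation in a CI job before a profiling run starts.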

Check out partner testimonials and ecosystem.

Adobe

"Vulkan is the cornerstone of Adobe’s multi-platform, multi-vendor rendering strategy for its Adobe Substance 3D products. Thanks to the ray-tracing extensions that NVIDIA pioneered and contributed to Khronos, Vulkan gives native access to ray-tracing hardware, offering exceptional ray-tracing performance on supported devices. In addition, Nsight Graphics and Nsight Systems are invaluable tools when it comes to understanding and improving the performance of Vulkan ray-tracing applications."



— Francois Beaune, Lead Software Engineer of Photorealistic Rendering, Adobe 3D and Immersive

Microsoft

"NVIDIA Nsight Systems has enabled the Microsoft Azure HPC+AI team to perform detailed analysis and optimize GPU-accelerated AI and software for our services and customers. The tool paints a clear picture of events on the CPUs, GPUs, NICs, and OS, which have allowed us to quickly identify the top time-consuming functions and cold spots to target."

— Kushal Datta, Principal Software Engineer, Microsoft Azure HPC+AI





"We noticed that our new Quadro P6000 server was ‘starved’ during training, and we needed experts for supporting us. NVIDIA Nsight Systems helped us to achieve over 90 percent GPU utilization. A deep learning model that previously took 600 minutes to train, now takes only 90."

— Felix Goldberg, Chief AI Scientist, Tracxpoint

Adobe
Autodesk
Dassault Systèmes
Epic Games
Maxon
PopcornFX
Ubisoft

Deepset achieves a 3.9X speedup and 12.8X cost reduction for training natural language processing (NLP) models by working with AWS and NVIDIA.

Learn More

Watch Nsight Systems sessions and technical videos on demand.

Stay up to date on the latest Nsight Systems news.

GPU-Accelerated Video Processing with NVIDIA In-Depth Support for Vulkan Video

January 30, 2023


CUDA Toolkit 12.0 Released for General Availability

December 12, 2022


Just Released: CUDA Toolkit 12.0

December 8, 2022


Upcoming Workshop: Fundamentals of Accelerated Computing with CUDA C/C++

December 5, 2022


Keep Up with the Latest in NVIDIA Game Development

Ready to get started with NVIDIA Nsight Systems?

Download Now