Get Started With Nsight Systems

Download NVIDIA Nsight Systems

Nsight Systems 2024.2 is Available Now

Review the supported platforms for NVIDIA Nsight™ Systems to choose the correct version for your host and profiling target.

If profiling from the CLI, pick your platform based on where the CLI will be run. If using the GUI (Full Version) to view reports, do profiling, or do remote profiling, pick your platform based on the host PC architecture where the GUI will be run.

Also review the system requirements before downloading.


Desktop, workstation, and server platforms:



This download is for local and remote profiling of Windows and Linux servers, workstations, and gaming PCs. Profiling is supported on x86-64 architectures.


See the supported platforms for specifics about combinations of local, remote, and mixed-OS compatibilities.


Download:





Nsight Systems 2024.2 Full Version

Nsight Systems 2024.2 CLI Only




Nsight Systems 2024.2 Arm Servers and NVIDIA Grace Full Version

Nsight Systems 2024.2 Arm Servers and NVIDIA Grace CLI Only



Download Nsight Systems 2024.2 macOS Host

This platform only supports viewing reports collected from a CLI or remotely profiling Linux laptops, desktops, workstations, and servers.


See the supported platforms.
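
The platform choice described above can be sketched as a small lookup. This is only an illustration, not an NVIDIA tool: `pick_package` and `normalize_arch` are hypothetical helpers, the package names are copied from the download list on this page, and the mapping assumes the Arm packages are meant for Linux hosts only.

```python
import platform
from typing import Optional

# Package names copied from the download list on this page.
PACKAGES = {
    ("x86_64", False): "Nsight Systems 2024.2 Full Version",
    ("x86_64", True): "Nsight Systems 2024.2 CLI Only",
    ("arm64", False): "Nsight Systems 2024.2 Arm Servers and NVIDIA Grace Full Version",
    ("arm64", True): "Nsight Systems 2024.2 Arm Servers and NVIDIA Grace CLI Only",
}


def normalize_arch(machine: str) -> str:
    """Map common platform.machine() strings onto two architecture keys."""
    name = machine.lower()
    if name in ("x86_64", "amd64"):
        return "x86_64"
    if name in ("aarch64", "arm64"):
        return "arm64"
    return name


def pick_package(system: str, machine: str, cli_only: bool) -> Optional[str]:
    """Return the download name for the machine where the CLI or GUI will run.

    Returns None when this page lists no matching package.
    """
    if system == "Darwin":
        # The macOS host only views reports or profiles remotely; no CLI package.
        return None if cli_only else "Nsight Systems 2024.2 macOS Host"
    arch = normalize_arch(machine)
    if system == "Windows" and arch != "x86_64":
        # Assumption: no Windows-on-Arm package is listed on this page.
        return None
    # Note: Jetson and DRIVE devices also report aarch64, but they get
    # Nsight Systems through JetPack or DRIVE OS rather than these downloads.
    return PACKAGES.get((arch, cli_only))


if __name__ == "__main__":
    print(pick_package(platform.system(), platform.machine(), cli_only=False))
```

Run it on the machine where the CLI or GUI will execute, which is not necessarily the profiling target.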


JupyterLab integration:



The Nsight Tools JupyterLab Extension allows you to profile Python and other supported languages directly in Jupyter, including detailed analysis with the full Nsight Systems GUI.



Embedded and automotive platforms:



Nsight Systems is bundled as part of the Jetson development suite in the NVIDIA JetPack™ SDK.



Nsight Systems is bundled as part of DRIVE OS for development and deployment on NVIDIA DRIVE AGX™-based autonomous vehicles.


View Nsight Systems documentation.




Supported Platforms


Nsight Systems is distributed through multiple packages. Pick a “Profiling Target” column and learn what hosts may be used to profile (local or remote) as well as view reports.


| From Host \ Profiling Target | Linux Workstations & Servers | Windows Workstations & Gaming PCs | NVIDIA DPUs & SuperNICs | Jetson & IGX | DRIVE |
|---|---|---|---|---|---|
| Windows | Remote GUI*, Report Viewer** | Local CLI & GUI, Remote GUI*, Report Viewer** | Remote CLI, Remote Report Viewer | Remote Report Viewer | Remote Report Viewer |
| Mac | Remote GUI*, Report Viewer** | Remote Report Viewer** | Remote Report Viewer | Remote Report Viewer | Remote Report Viewer |
| Linux | Local CLI & GUI, Remote GUI*, Report Viewer** | Remote GUI, Report Viewer** | Remote CLI, Remote Report Viewer | Remote GUI, Report Viewer*** | Remote GUI, Report Viewer*** |
| DPU / SuperNIC | N/A | N/A | Local CLI | N/A | N/A |
| Jetson | N/A | N/A | N/A | Local CLI & GUI, Report Viewer*** | N/A |
| DRIVE | N/A | N/A | N/A | N/A | Local CLI |

* For x86-64 targets only, or for opening reports collected from a CLI

** Only for reports collected from Windows or Linux PCs & servers of equal or lesser versions

*** Only for reports collected from Jetson or DRIVE OS of equal or lesser versions
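
The "equal or lesser versions" rule in the footnotes can be sketched as a version comparison. This is a minimal illustration under stated assumptions: `parse_version` and `viewer_can_open` are hypothetical helpers, versions are assumed to be dot-separated integers such as `2024.2`, and the platform-family restriction in the footnotes (PC reports versus Jetson or DRIVE OS reports) is not modeled.

```python
def parse_version(text: str) -> tuple:
    """Turn a version string such as '2024.2' into a comparable tuple (2024, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


def viewer_can_open(viewer_version: str, report_version: str) -> bool:
    """True when the report comes from an equal or older version than the viewer."""
    return parse_version(report_version) <= parse_version(viewer_version)


if __name__ == "__main__":
    for report in ("2023.4", "2024.2", "2024.3"):
        print(report, viewer_can_open("2024.2", report))
```

In practice this means the host used for viewing should run a version at least as new as the one that collected the report.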




System Requirements


Nsight Systems is compatible with Windows workstations and PCs, Linux workstations and servers, and Jetson and NVIDIA DRIVE Autonomous Machines. Learn about the system requirements and support for your development platform below.


| | Windows Workstations and Gaming PCs | Linux Workstations and Servers | Linux Arm Servers | Jetson and DRIVE Autonomous Machines |
|---|---|---|---|---|
| Operating Systems | Windows 10 or newer | Ubuntu 20.04* and 22.04; WSL-Ubuntu 2.0; CentOS 7+*; RHEL 7, 8, 9; SLES 15; Debian 10, 11, 12; Fedora 37; KylinOS 10; OpenSUSE 15; Rocky 8, 9 | Ubuntu 20.04* and 22.04; RHEL 8, 9; SLES 15 | Jetson Linux; DRIVE OS |
| Target Hardware | GPU: Pascal or newer; CPU: x86-64 processors | GPU: Pascal or newer; CPU: x86-64 processors** | GPU: Pascal or newer; Arm-SBSA servers | NVIDIA IGX, Jetson AGX Orin, Jetson AGX Xavier, Jetson TX2, Jetson TX1, DRIVE AGX Orin, DRIVE AGX Pegasus, DRIVE AGX Xavier, DRIVE PX Parker AutoChauffeur, DRIVE PX Parker AutoCruise |
| Target Software | 64-bit applications only; CUDA 10.0+ for CUDA trace; Driver 418 or newer*** | 64-bit applications only; CUDA 10.0+ for CUDA trace; Driver 418 or newer*** | 64-bit applications only; CUDA 10.0+ for CUDA trace; Driver 418 or newer*** | |
| Local Profiling | CLI and GUI | CLI and GUI | CLI and GUI | CLI (all platforms), GUI (Jetson Linux only) |
| Remote Profiling From Platforms | Windows 10+, macOS 11+, Ubuntu 20.04+ | Windows 10+, macOS 11+, Ubuntu 20.04+ | N/A | Ubuntu 22.04 |

* For older OS versions, please use Nsight Systems 2020.3
** Intel Haswell architecture or newer is required for LBR sampling backtrace
*** Driver 535 and newer improves GPU profiling stability. Please use the latest driver for the best results. Download here.
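
The driver, CUDA, and GPU thresholds above can be checked with a few comparisons. The sketch below is illustrative only: the function names are hypothetical, the inputs are plain version strings (a driver string can be obtained with, for example, `nvidia-smi --query-gpu=driver_version --format=csv,noheader`), and mapping "Pascal or newer" to compute capability 6.0 or higher is an assumption added here rather than something this page states.

```python
MIN_DRIVER_MAJOR = 418          # "Driver 418 or newer"
RECOMMENDED_DRIVER_MAJOR = 535  # "Driver 535 and newer improves GPU profiling stability"
MIN_CUDA = (10, 0)              # "CUDA 10.0+ for CUDA trace"
MIN_COMPUTE_CAPABILITY = 6      # assumption: Pascal corresponds to compute capability 6.x


def driver_status(driver_version: str) -> str:
    """Classify a driver string such as '535.104.05' by its major version."""
    major = int(driver_version.strip().split(".")[0])
    if major < MIN_DRIVER_MAJOR:
        return "unsupported"
    if major < RECOMMENDED_DRIVER_MAJOR:
        return "supported"
    return "recommended"


def cuda_trace_supported(cuda_version: str) -> bool:
    """True when a CUDA version string such as '12.4' meets the CUDA trace minimum."""
    parts = cuda_version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) >= MIN_CUDA


def gpu_supported(compute_capability: str) -> bool:
    """True when a compute capability string such as '6.1' is Pascal-class or newer."""
    return int(compute_capability.strip().split(".")[0]) >= MIN_COMPUTE_CAPABILITY


if __name__ == "__main__":
    print(driver_status("470.82"), cuda_trace_supported("11.8"), gpu_supported("8.6"))
```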






Release Notes


2024.2

  • Kubernetes & Helm injection support
    • Available now through NVIDIA NGC Catalog
    • Additionally compatible with Azure AKS, AWS EKS, Google GKE, Oracle OKE
  • NVIDIA Nsight Tools JupyterLab Extension for cell profiling
  • Analysis Recipes
    • Communications
      • NVIDIA NIC/HCA Throughputs
      • NCCL identification
      • Compute overlap
    • CUDA Graphs
    • Heatmap overview graphs
    • Diff overview graphs
    • Performance improvements
  • Windows OpenXR trace
    • Trace OpenXR API calls
    • Display xrBeginFrame/xrEndFrame frame boundaries on the timeline
    • Display XR_EXT_debug_utils markers
  • Windows D3D12 Work Graphs trace
  • Windows CPU per-core type info (Performance|Efficient)
  • Windows report finalization performance
  • UX and performance improvements
    • Y-axis labels on the timeline
  • InfiniBand network information generated by ibdiagnet: switch and node names instead of IDs


2024.1

  • System-wide Vulkan trace on Windows (start before target app launch)
  • Windows GPU resource trace enhancements
    • Vulkan support
    • All resource allocations
    • VidMM Device Suspend & Resume
    • Improved sorting
    • Grouping all memory management timeline rows together under VRAM node
  • Windows symbol resolution performance enhancements
  • Windows environment variable editor
  • Video trace support for Linux open GPU kernel module
  • CPU metrics sampling improvements
  • GUI timeline enhancements: display row descriptions or Y-axis limits
  • Networking
    • InfiniBand switch metric sampling per-port
    • InfiniBand switch congestion events per-port
  • UX and performance improvements


2023.4

  • Multi-node analysis enhancements
    • Now supports Mac, Windows x64, Linux Arm Servers
    • Recipe enhancements for NCCL, heatmaps, differencing
  • Unified memory page fault trace for Arm Servers
  • Beta NVIDIA InfiniBand switch congestion events on new firmware
  • NVIDIA Grace PMU uncore counter sampling
  • Windows GPU resource trace enhancements for allocations, migrations, Direct3D
  • Python GIL trace
  • UX and performance improvements

2023.3

  • System-wide Direct3D 12 API trace
  • Resource migration trace on Windows
  • NIC metrics profiling from GUI
  • UX and performance improvements

2023.2

  • Python sampling
  • NIC and Switch metrics sampling
  • Multi-report analysis
  • View multiple reports on GUI with merged timelines
  • Support for Opacity Micromaps on Vulkan
  • UX and performance improvements






Feature Table

| Feature | Linux Workstations and Servers | Windows Workstations and Gaming PCs | Jetson Autonomous Machines | DRIVE Autonomous Vehicles |
|---|---|---|---|---|
| View system-wide application behavior across CPUs and GPUs | | | | |
| CPU cores utilization, process, & thread activities | yes | yes | yes | yes |
| CPU thread periodic sampling backtraces | yes* | yes | yes | yes |
| CPU thread blocked state backtraces | yes** | yes | yes | yes |
| CPU performance metrics | yes | no | yes | yes |
| GPU workload trace | yes | yes | yes | yes |
| GPU context switch trace | yes | yes | yes | yes |
| SOC hypervisor trace | - | - | - | yes |
| SOC memory bandwidth sampling | - | - | yes | yes |
| SOC Accelerators trace | - | - | Xavier+ | Xavier+ |
| OS Event Trace | ftrace | ETW | ftrace | ftrace, QNX kernel events |
| Investigate CPU-GPU interactions and bubbles | | | | |
| User annotations API trace: NVIDIA Tools Extension API (NVTX) | yes | yes | yes | yes |
| CUDA API | yes | yes | yes | yes |
| CUDA libraries trace (cuBLAS, cuDNN & TensorRT) | yes | no | yes | yes |
| OpenGL API trace | yes | yes | yes | yes |
| Vulkan API trace | yes | yes | no | no |
| Direct3D12, Direct3D11, DXR, & PIX APIs | - | yes | - | - |
| OpenXR | - | yes | - | - |
| OptiX | 7.1+ | 7.1+ | - | - |
| Bidirectional correlation of API and GPU workload | yes | yes | yes | yes |
| Identify GPU idle and sparse usage | yes | yes | yes | yes |
| Multi-GPU Graphics trace | OpenGL and Vulkan | Direct3D12, OpenGL and Vulkan | - | - |
| Trace graphics resource migration between VRAM and System Memory | - | yes | - | - |
| Ready for big data | | | | |
| Fast GUI capable of visualizing in excess of 10 million events on laptops | yes | yes | yes | yes |
| Additional command line collection tool | yes | no | no | no |
| NV-Docker container support | yes | - | - | - |
| NVIDIA GPU Cloud support | yes | - | - | - |
| Minimum user privilege level | user | administrator | root | root |

* On Intel Haswell and newer CPU architecture
** Only with OS runtime trace enabled. Some syscalls, such as those made from handcrafted assembly, may be missed. Backtraces may only appear if time thresholds are exceeded.
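
The NVTX user annotations listed in the feature table can be sketched in Python. This is a minimal illustration, assuming the optional `nvtx` package from PyPI and its `nvtx.annotate` context manager; when the package is not installed, a no-op stand-in keeps the script runnable. The function names and workload are hypothetical.

```python
from contextlib import contextmanager

try:
    import nvtx  # optional dependency: pip install nvtx
    annotate = nvtx.annotate
except ImportError:
    @contextmanager
    def annotate(message=None, **kwargs):
        """No-op stand-in used when the nvtx package is unavailable."""
        yield


def load_data(n):
    # Each annotated block is recorded as a named NVTX range.
    with annotate("load_data"):
        return list(range(n))


def compute(values):
    with annotate("compute"):
        return sum(v * v for v in values)


def run(n=1000):
    # Ranges nest: "load_data" and "compute" sit inside "run".
    with annotate("run"):
        return compute(load_data(n))


if __name__ == "__main__":
    print(run())
```

To collect these ranges, the script would be launched under the profiler with NVTX trace enabled, for example `nsys profile --trace=cuda,nvtx python script.py`, where `script.py` is a placeholder file name.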






Archives

Access older versions of Nsight Systems in the Gameworks Download Center.
View release notes for older versions in the Nsight Systems documentation archive.






Resources

Nsight Systems Documentation

You can also learn about installing & using the NVIDIA Tools Extension API (NVTX) here.

Nsight Tools Tutorial Center

Access the latest resources to get started with Nsight Systems.




Access Self-Paced Training

Get hands-on training for Nsight Systems with self-paced online courses from the NVIDIA Deep Learning Institute.

See more courses on Accelerated Computing for Developers.

By Invitation only: Fundamentals of Accelerated Computing with CUDA C/C++

Learn how to optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques.

Learn More

Fundamentals of Accelerated Computing with OpenACC

Learn how to profile applications to identify optimization needs, and more ways to accelerate C/C++ or Fortran applications with OpenACC.

Learn More

Accelerating CUDA C++ Applications with Concurrent Streams

Build robust and efficient CUDA C++ applications that can leverage copy/compute overlap for significant performance gains.

Enroll Now

Scaling Workloads Across Multiple GPUs with CUDA C++

Develop robust and efficient CUDA C++ applications that can leverage all available GPUs on a single node.

Enroll Now

Optimizing CUDA Machine Learning Codes With Nsight Profiling Tools

Use Nsight Systems to analyze overall application structure and Nsight Compute to analyze and optimize individual CUDA kernels.

Enroll Now





Tutorial Sessions


Profiling GPU Applications with Nsight Systems

This webinar gives an overview of NVIDIA's Nsight profiling tools. It explores how to analyze and optimize the performance of GPU-accelerated applications.

Watch (54:52)

Fundamentals of Ray Tracing Development using Nsight Graphics and Nsight Systems

Learn how to utilize Nsight Graphics and Nsight Systems to profile and optimize 3D Applications that are using Ray Tracing.

Watch (2:04:45)

Investigating Hidden Bottlenecks for Multi-Node Workloads

Learn how Nsight Systems can help users identify bottlenecks, investigate their causes, and support developers working at multi-GPU multi-node scales.

Watch (47:21)

Optimizing Communication with Nsight Systems Network Profiling

Learn how to use Nsight Systems' network profiling capabilities and see how real-world applications utilize GPUs, CPUs, and networking hardware.

Watch (41:45)

Overcoming Pre- and Post-Processing Bottlenecks in AI Imaging and CV Pipelines with CV-CUDA

Watch how Nsight Systems can be used to analyze performance markers and find optimization opportunities for cloud-scale AI.

Watch (42:47)

Optimizing HPC Simulation and Visualization Code Using NVIDIA Nsight Systems

The NIH Center for Macromolecular Modeling and Bioinformatics used Nsight Systems to achieve a 3x performance increase analyzing large biomolecular systems.

Watch (40:57)





Video Series

Learn about using Nsight Systems for CUDA Development in the CUDA Developer Tools tutorial series.


CUDA Developer Tools | NVIDIA Nsight Tools Ecosystem

Watch (4:53)

CUDA Developer Tools | Introduction to Nsight Systems

Watch (9:20)

CUDA Developer Tools | Performance Analysis with the Nsight Systems Timeline

Watch (9:20)

Optimizing CUDA Memory Allocations Using NVIDIA Nsight Systems

Watch (1:25)

Nsight Systems Command Line Feature Spotlight

Watch (1:38)

Analyzing NCCL Usage with NVIDIA Nsight Systems

Watch (1:58)

Nsight Systems Feature Spotlight: OpenMP

Watch (1:19)

Nsight Systems - Vulkan Trace

Watch (1:28)





Support

To provide feedback, request additional features, or report support issues, please use the Developer Forums.