NVIDIA® Nsight™ Compute is an interactive kernel profiler for CUDA applications. It provides detailed performance metrics and API debugging through a user interface and a command line tool. Its baseline feature lets users compare results within the tool. Both the user interface and the metric collection are customizable and data-driven, and the tool can be extended with analysis scripts for post-processing results.



 Download 2021.3.1      Download 2021.3.0
Version 2021.3 New Features  |  Revision History

NVIDIA® Nsight™ Compute is freely offered through the NVIDIA Registered Developer Program and as part of the CUDA Toolkit.


Roofline Analysis

Memory Workload Analysis

Baseline Comparisons

  • Set multiple baselines to compare variations in GPU architecture, kernel launch parameters, memory usage, and more
  • Compare performance metrics between baselines and the current run, including the ability to compare child processes
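
To make the idea of a baseline comparison concrete, here is a minimal Python sketch that computes the percent change of each metric in a current run relative to a baseline. It is not how Nsight Compute implements baselines; the function name, metric names, and values are all hypothetical placeholders chosen for illustration.

```python
def compare_to_baseline(baseline, current):
    """Return {metric: percent change of current relative to baseline}."""
    changes = {}
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue  # skip metrics that cannot be compared
        changes[name] = (current[name] - base_value) / base_value * 100.0
    return changes


# Placeholder metric names and values, not real measurements.
baseline = {"duration_us": 200.0, "dram_throughput_pct": 40.0}
current = {"duration_us": 150.0, "dram_throughput_pct": 50.0}

for metric, pct in compare_to_baseline(baseline, current).items():
    print(f"{metric}: {pct:+.1f}%")
```

Metrics missing from the current run, or with a zero baseline, are skipped rather than reported, since a relative change is undefined for them.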

Run from Nsight Compute GUI or from Console Command Line

  • The Nsight Compute GUI provides the text of the equivalent console commands
  • GUI/Console provide similar features, functionality, output, and reports
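
As a rough illustration of the console side, the Python sketch below assembles an `ncu` command line without executing it. The flags used (`-o`, `--set`, `--kernel-name`, `--launch-count`) are assumed to match the Nsight Compute CLI of the installed version, and the application path and kernel name are hypothetical; check `ncu --help` before relying on any of them.

```python
import shlex


def build_ncu_command(app, report="profile", kernel=None,
                      launch_count=None, metric_set="full"):
    """Assemble an ncu command line as an argument list (nothing is executed)."""
    cmd = ["ncu", "-o", report, "--set", metric_set]
    if kernel is not None:
        # Restrict profiling to kernels with this name (assumed flag).
        cmd += ["--kernel-name", kernel]
    if launch_count is not None:
        # Limit how many kernel launches are profiled (assumed flag).
        cmd += ["--launch-count", str(launch_count)]
    cmd.append(app)
    return cmd


# Hypothetical application and kernel names, for illustration only.
command = build_ncu_command("./my_app", kernel="vector_add", launch_count=1)
print(shlex.join(command))
```

On a machine with Nsight Compute installed, the resulting list could be passed to `subprocess.run` to automate profiling from a script.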

CUDA Task Graph Profiling

  • Stop at a kernel launch from a graph node
  • State of graph node shown in resource page
  • Export graph visualization

Source Code Correlation

  • Correlate individual Source, SASS, or PTX lines and metrics
  • Shown here with PC Sampling data available in Volta and Turing architectures
  • Heat map for identifying high metric values

Nsight Compute integrates into Visual Studio using NVIDIA Nsight Integration.
Visual Studio project settings are transferred to Nsight Compute.

Other Features

  • Interactive kernel profiler
  • Profiler report for kernels and/or child processes
  • Diffing results across one or multiple reports using baselines
  • Fast data collection
  • Intuitive UI for interactive profiling
  • Command line operation for manual and automated profiling
  • Fully customizable reports and rules

Variations from the Nsight Compute 2021.3 release found in CUDA Toolkit 11.5

  • This version is a reposting of the version in the CUDA Toolkit 11.5.
  • A macOS host download is available here, but is not included in the CUDA Toolkit.
  • We may update this site with bug fixes, as needed.

System Requirements

Supported platforms

    Host
    • Linux x86_64[1]
    • Windows x86_64[1]
    • macOS[1]
    Target
    • Linux x86_64[1]
    • Windows x86_64[1]
    • Linux PowerPC[1]
    • Linux aarch64 sbsa[1]
    • DRIVE OS QNX aarch64[2][3]
    • DRIVE OS Linux aarch64[2][3]

Supported NVIDIA GPU architectures

  • Ampere: A100 with Multi-Instance GPU, GA10x
  • Turing: TU1xx
  • Volta: GV100[1], GV10B[2]

Drivers

    Please use the following drivers, provided with the CUDA Toolkit 11.5 production release, or a more recent version:
    • 496.13 (Windows)
    • 495.29.05 (Linux)
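
The minimum driver versions listed above can be checked in a script. The following is a minimal Python sketch under stated assumptions: the helper names are hypothetical, the installed version string is assumed to come from elsewhere (for example from `nvidia-smi`), and version strings are assumed to be purely numeric and dot-separated.

```python
# Minimum driver versions taken from the list above.
MIN_DRIVER = {"windows": (496, 13), "linux": (495, 29, 5)}


def parse_version(text):
    """Turn a dotted version string such as '495.29.05' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_supported(platform, installed):
    """Return True if the installed driver is at least the listed minimum."""
    return parse_version(installed) >= MIN_DRIVER[platform]


# Example version strings, for illustration only.
print(driver_is_supported("linux", "495.29.05"))
print(driver_is_supported("windows", "472.12"))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings lexically, where leading zeros and differing digit counts can give the wrong ordering.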

[1] Available in this download and the CUDA Desktop Toolkit.
[2] Available in the Embedded or Drive toolkits only.
[3] Only the command line interface (CLI) is provided for these platforms. There is no Nsight Compute GUI application for these platforms.

Documentation, Videos, and Blogs

Nsight Compute Documentation
Videos
Blogs

Support

To provide feedback, request additional features, or report Nsight Compute issues, please use the Developer Forums.