This page contains instructional videos for NVIDIA® Nsight™ Compute. These videos are a great resource for enhancing your understanding of all the features Nsight Compute has to offer.

GPU Technology Conference 2021: Nsight Compute 2021.1 - Resource Tracking for new CUDA Toolkit 11.3 Resources

This GTC 2021 spotlight highlights Nsight Compute's support for CUDA Toolkit 11.3 resources with the Nsight Compute Resource Tracker, including:

  • User Objects
  • Memory Allocations
  • Memory Pools
  • Graph Nodes

Presented 04-12-2021 | GTC 2021: Nsight Compute 2021.1 (CUDA 11.3) | New in 2021.1 | View on YouTube

GPU Technology Conference 2021: CUDA is Evolving, and the Latest Developer Tools are Adapting to Keep Up

As the CUDA ecosystem rapidly expands, developers need the latest tools (including Nsight Compute) to make sure they can debug, profile, and optimize to take advantage of it all. Nsight developer tools along with other CUDA debuggers, profilers, and checkers are adding new features all the time to help. We'll go over the latest features in developer tools (including Nsight Compute) with a focus on the problems they help you solve. You'll learn about updated performance metrics for the latest architecture, new workflows that enable more usage models, and usability features to get you the information you're looking for, faster. This session is relevant to anyone developing for CUDA platforms, whether it's the first time you've picked up the tools or you're a power user looking to learn the latest and greatest new features.

Presented 04-12-2021 | GTC 2021: Nsight Compute 2021.1 (CUDA 11.3) | New in 2021.1 | View on YouTube

Supercomputing 2020: Nsight Compute 2020.2 - New Profiling mode: Application Replay

This Supercomputing 2020 spotlight introduces the new Application Replay mode, which complements the Kernel Replay mode.

First, you'll get an overview of the Nsight Compute profiling and collection process, diving into the normal 'Kernel' replay mode. Then, we'll look at situations when using the new 'Application' replay mode is advantageous.

Presented 11-09-2020 | Supercomputing 2020: Nsight Compute 2020.2 (CUDA 11.1) | New in 2020.2 | View on YouTube

Supercomputing 2020: Nsight Compute's Roofline and NVIDIA Ampere GPU Architecture Analysis

This Supercomputing 2020 spotlight reviews Nsight Compute's Roofline Analysis tool as well as new analysis features for NVIDIA's Ampere GPU architecture.

We will show how Roofline analysis provides a graphical view of a CUDA kernel's arithmetic intensity and FLOPS performance. Using this analysis, it's easy to see how your kernel's performance compares to hardware constraints, how much room there is for improvement, and which kinds of optimizations can be employed to improve performance.
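As a rough illustration of the idea behind a roofline chart, the sketch below uses the standard roofline model, where attainable throughput is the smaller of peak compute and arithmetic intensity times peak memory bandwidth. The hardware limits and kernel numbers are hypothetical placeholders, not values reported by Nsight Compute, and the function names are made up for this example.

```python
def attainable_flops(arithmetic_intensity, peak_flops, peak_bandwidth):
    """Roofline ceiling in FLOP/s for a kernel with the given FLOP/byte ratio."""
    return min(peak_flops, arithmetic_intensity * peak_bandwidth)


def ridge_point(peak_flops, peak_bandwidth):
    """Arithmetic intensity (FLOP/byte) where the memory and compute roofs meet."""
    return peak_flops / peak_bandwidth


# Hypothetical hardware limits, chosen only to keep the arithmetic simple.
PEAK_FLOPS = 10.0e12      # 10 TFLOP/s
PEAK_BANDWIDTH = 1.0e12   # 1 TB/s

# Hypothetical kernel: 2 floating-point operations per 8 bytes moved.
kernel_ai = 2.0 / 8.0     # 0.25 FLOP/byte

ceiling = attainable_flops(kernel_ai, PEAK_FLOPS, PEAK_BANDWIDTH)
ridge = ridge_point(PEAK_FLOPS, PEAK_BANDWIDTH)

# Left of the ridge point the kernel is limited by memory bandwidth,
# right of it by compute throughput.
bound = "memory" if kernel_ai < ridge else "compute"

print(f"ceiling: {ceiling:.3e} FLOP/s, ridge: {ridge} FLOP/byte, bound: {bound}")
```

The gap between a kernel's measured FLOP/s and this ceiling is the headroom a roofline chart makes visible; whether the kernel sits left or right of the ridge point suggests whether to focus on memory traffic or on compute.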

Next, we will explore how Nsight Compute allows you to monitor the throughput of the NVIDIA Ampere architecture's CUDA Asynchronous Copy feature.

Presented 11-11-2020 | Supercomputing 2020: Nsight Compute 2020.2 (CUDA 11.1) | New in 2020.1 | View on YouTube

Nsight Compute 2020.1 Spotlight

This NVIDIA Nsight Compute 2020.1 release spotlight highlights these new features:

  • Roofline Analysis for Visualization of Performance Headroom
  • NVIDIA Ampere Architecture metrics
    • Asynchronous Copy to Shared Memory
    • Sparse Data Compression

Nsight Compute Overview | New in 2020.1, available 2020/05/28 (CUDA 11.0) | View on YouTube

GTC 2020 Lab: Modern CUDA Programming Hazards and the Linux Nsight Toolbox to Fix Them

In this hands-on lab, you'll learn from NVIDIA developers and experts how to efficiently debug, profile, and optimize CUDA applications on Linux. Through a set of exercises, you'll use the latest features in NVIDIA's suite of tools to detect and fix common correctness and performance issues in your applications.

Presented 05-21-2020 | GTC 2020: Nsight Compute 2020.1 (CUDA 11.0) | View on DevZone | Lab Materials on GitHub

GTC 2020: Optimizing CUDA Kernels in HPC Simulation and Visualization Codes Using NVIDIA Nsight Compute 2020.1

NVIDIA engineers and the developers of molecular modeling tools at the University of Illinois will share their experiences using NVIDIA Nsight Compute to analyze and optimize several CUDA/OptiX kernels in HPC applications, such as VMD and NAMD. This presentation highlights several intermediate and advanced kernel profiling techniques and shows you how to iteratively identify bottlenecks and improve your kernel performance. You'll also get an overview of NVIDIA Nsight Compute 2020.1 features, including support for the new NVIDIA Ampere architecture and the new Roofline Analysis.

Presented 05-21-2020 | GTC 2020: Nsight Compute 2020.1 (CUDA 11.0) | View on DevZone

Blue Waters Webinar 2019: Introduction to NVIDIA Nsight Compute - A CUDA Kernel Profiler

Understanding and optimizing the runtime behavior of your code can be challenging, but the effort is often rewarded with significant performance gains. NVIDIA Nsight Compute is a CUDA kernel profiler that provides detailed performance data and offers guidance for optimizing your CUDA kernels. You'll learn how to collect a wide range of performance data for your CUDA kernels, how automatic rules help detect common performance pitfalls and offer guidance through the profile reports, how to quickly compare profiling results to evaluate the effects of your code changes, and how to customize the tool to best fit your optimization workflow.

Presented 11-06-2019 | GTC 2020: Nsight Compute 2019.4 (CUDA 10.2) | View on

GTC Silicon Valley 2019 ID: S9345: CUDA Kernel Profiling Using NVIDIA Nsight Compute

Learn about NVIDIA's developer tool, Nsight Compute, for optimizing your CUDA kernels. Nsight Compute is an interactive kernel profiler for CUDA applications that provides detailed performance metrics and API debugging via a user interface and command line tool. In addition, its baseline feature allows users to compare results within the tool. We will explain how Nsight Compute provides a customizable and data-driven user interface and metric collection and can be extended with analysis scripts for post-processing results.
View the slides (pdf)

Presented March 2019 | GTC 2020: Nsight Compute 2019.1 (CUDA 10.1) | View on DevZone

SIGGRAPH 2018: OptiX Profiling with Nsight Compute

In this hands-on live demo, we'll show how Nsight Compute can be used to profile applications built with NVIDIA OptiX. We'll identify performance bottlenecks in several OptiX applications and point out the key differences between vanilla CUDA programs and OptiX applications from a profiling perspective. We'll also demonstrate how to customize Nsight Compute to extract and present profiling information in the way that is most suitable for a given OptiX application. This talk will contain almost no slides and will instead focus on live usage of the tools involved.

Presented Aug 15 2018 | SIGGRAPH2018: Nsight Compute 1.0 (CUDA 10.0) | View on


Learn about NVIDIA's developer tools for optimizing your CUDA kernels. We will cover the latest updates to the Visual Profiler, nvprof, and Nsight, and we will introduce NVIDIA's next generation of kernel profiling tools. See our new tools in action, providing a consistent experience across all platforms, lower performance overhead, and ways to customize the tools to your needs.

Presented March 29, 2018 | GTC2018: Nsight Compute Preview (CUDA 9.2) | View on
