Nsight Compute 2026.1 - New Features
Updates in 2026.1.0
- Added support for CUDA 13.2.
- The minimum supported version of macOS is now 13.0.
- The Python Report Interface can now be installed as a standalone package from PyPI using pip install ncu-report.
- Nsight Compute now supports profiling Linux (aarch64 sbsa) targets from macOS hosts.
- The Start Activity Dialog now supports editing multi-line command line arguments.
- Added the Report Merge Tool. It enables you to combine multiple reports into one. It is particularly useful for multi-GPU systems and for scenarios where comparing and analyzing several reports individually becomes impractical.
- Added the Clustering Window. It helps you analyze and compare multiple profiling reports by grouping similar reports together. This makes it easier to identify performance patterns and find relationships between different profiling sessions.
- Added Register Dependencies analysis to the Source page. It helps you identify general purpose register dependencies and occupancy issues due to live register pressure. Added Attributed Live Registers and Output Registers metric columns in the Source page views. Renamed Instructions & Scoreboards to Instructions & Dependencies.
- The Source metrics page now shows the metric pipelines associated with each instruction.
- Added a CUDA Graph Viewer tool window that dynamically visualizes CUDA Graphs during interactive profiling.
- Timelines on the Details page now overlay related metrics in a single row when possible. They now also show a max bar in the background to indicate the maximum value at any zoom level. Rows can be switched between theoretical peak and collected maximum value for the Y-axis scale.
- The Instruction Statistics section now has thread-level charts.
- Improved alignment of rule elements on the Details page.
- Moved the section body dropdowns to the left of the Details page section headers.
- Renamed “Kernel Analysis” to “SASS Analysis” in the Options dialog.
- The Save As dialog now supports the .ncu-repz file extension for compressed reports.
- Mandatory concurrent kernels (e.g. NCCL communication kernels) can now be profiled across processes from the same process tree using the --communicator shmem option.
- The tool now generates a persisted log file in case of non-recoverable errors.
- The default value of --clock-control is now boost.
- Fixed issues with OptiX command lists in interactive profiling mode.
- Added occupancy as an option to --query-metrics-collection.
- Fixed issues with the Active Clusters graph in the occupancy calculator.
- Fixed an issue where the inline functions table could show incorrect metric values in some cases.
- Fixed issues with syntax highlighting on the Source page.
- Improved the performance of the Source Comparison view for large SASS diffs.
- Fixed issues with alignment of multi-pass PM sampling timelines without context switch trace.
- Using an unknown metric in a PM sampling timeline section file is now an error.
- Fixed issues with opcode category tooltips in the Instruction Statistics charts.
- Fixed issues in node-level profiling of CUDA device launchable graphs.
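
The command line changes listed above (the boost default for --clock-control, the --communicator shmem option, and the occupancy value for --query-metrics-collection) can be sketched as a small Python helper that only assembles argument lists and never launches ncu. The helper name build_ncu_command, the application path ./my_app, and the report name my_report are hypothetical and exist only for illustration. Pairing --query-metrics-collection with --query-metrics is an assumption, as is the exact way these options are combined on a real system.

```python
import shlex


def build_ncu_command(app_argv, report_path, clock_control="boost", communicator=None):
    """Assemble an ncu argument list (hypothetical helper, not part of Nsight Compute).

    clock_control defaults to "boost", mirroring the new default named in these notes.
    communicator is only added when given, e.g. "shmem" for mandatory concurrent kernels.
    """
    cmd = ["ncu", "--clock-control", clock_control]
    if communicator is not None:
        cmd += ["--communicator", communicator]
    cmd += ["-o", report_path]   # -o sets the output report name
    cmd += list(app_argv)        # the profiled application and its own arguments
    return cmd


# Profile a (hypothetical) multi-process application using the shmem communicator.
profile_cmd = build_ncu_command(
    ["./my_app", "--iters", "10"], "my_report", communicator="shmem"
)

# Query metrics for the occupancy collection (assumed to pair with --query-metrics).
query_cmd = ["ncu", "--query-metrics", "--query-metrics-collection", "occupancy"]

print(shlex.join(profile_cmd))
print(shlex.join(query_cmd))
```

Because the helper returns a plain list, it can be passed to subprocess.run on a machine where ncu is installed. Passing clock_control explicitly is one way to keep scripts stable if they relied on the previous default.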
General
NVIDIA Nsight Compute
NVIDIA Nsight Compute CLI
Resolved Issues
For a complete overview of all NVIDIA® Nsight™ Compute features and access to resources, please visit the main Nsight™ Compute page.
NVIDIA® Nsight™ Compute 2026.1 is available for download under the NVIDIA Registered Developer Program.
References