Nsight Compute 2025.4 - New Features

General

Added support for profiling CUDA tile workloads.



Introduced a new Tile section to summarize tile dimensions and pipeline utilization, displayed when enabled and a tile workload is profiled.



Source page supports correlation between SASS and high-level Tile code (limited to cuTile Python code).



Added a new ncu-repz file format for zstd compressed report files.


Added support for locking GPUs to the boost clock instead of the base clock on Ampere and newer GPUs. Use the boost and force-boost options on supported drivers.


By default, warp sampling now focuses on the Not Issued (_not_issued) variants of the metrics. This avoids pointing to source locations where warp stalls are mitigated by having a sufficient number of warps during an issue cycle to hide latency.
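The reasoning behind the Not Issued variants can be sketched with a toy model. The following is a minimal illustration only: the WarpSample record, its fields, and the stall_counts helper are hypothetical names invented for this example, not part of the Nsight Compute data model or API. It assumes each sample records whether the scheduler issued an instruction from any warp in that cycle, and shows how dropping the "issued" samples can change which source line looks most problematic.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class WarpSample:
    """One hypothetical warp-state sample (not the Nsight Compute data model)."""
    source_line: int        # source line the sampled warp was stalled at
    stall_reason: str       # e.g. "long_scoreboard", "barrier"
    scheduler_issued: bool  # True if the scheduler issued from some warp that cycle


def stall_counts(samples, not_issued_only):
    """Count samples per (source_line, stall_reason).

    With not_issued_only=True, samples taken in cycles where the scheduler
    still issued an instruction are dropped, because that latency was hidden
    by another eligible warp.
    """
    counts = Counter()
    for s in samples:
        if not_issued_only and s.scheduler_issued:
            continue
        counts[(s.source_line, s.stall_reason)] += 1
    return counts


# Made-up samples: line 10 stalls often, but mostly while other warps issue.
samples = [
    WarpSample(10, "long_scoreboard", True),
    WarpSample(10, "long_scoreboard", True),
    WarpSample(10, "long_scoreboard", False),
    WarpSample(42, "barrier", False),
    WarpSample(42, "barrier", False),
]

all_stalls = stall_counts(samples, not_issued_only=False)
exposed_stalls = stall_counts(samples, not_issued_only=True)

print(all_stalls.most_common(1))      # line 10 has the most samples overall
print(exposed_stalls.most_common(1))  # line 42 has the most exposed stalls
```

In this made-up data, line 10 collects the most stall samples overall, but most of them fall in cycles where another warp was issued. Counting only the not-issued samples moves line 42 to the top, which is the kind of location the new default is meant to highlight.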


Added support for node-level profiling of CUDA conditional graphs, including device-updatable nodes and nodes that can set conditional graph handles.



Added support for node-level profiling of CUDA graphs launched from the device (DGL), including host graph nodes that can launch DGL.



Source page now displays symbol labels: A new column for symbol labels has been added, and symbol labels are shown alongside addresses in SASS instruction disassembly. This change aligns the output with that of the nvdisasm tool.



Added support for collecting warp sampling metrics with PM sampling, allowing users to see function-level warp stalls for the selected time range in the timeline. See the Function Stats tool window for details.

For a complete overview of all NVIDIA® Nsight™ Compute features and access to resources, please visit the main Nsight™ Compute page.

NVIDIA® Nsight™ Compute 2025.4 is available for download under the NVIDIA Registered Developer Program.

References