Updates in 2025.3.0

    General

    • Added support for CUDA 13.0.
    • Added or improved support for Blackwell chips.
    • Removed support for Volta chips.
    • For Green Context launches, launch__waves_per_multiprocessor is now scaled to the number of SMs in the Green Context.
    • Added support for profiling individual nodes of device-launchable CUDA graphs launched from the host.
    • Added metric launch__persisting_l2_cache_size to the Memory Workload Analysis section.
    • Removed metric profiler__pmsampler_dropped_samples.
    • Added support for not importing SASS cubins into the report.
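The Green Context change to launch__waves_per_multiprocessor can be illustrated with a small arithmetic sketch. It assumes the commonly used definition of waves per SM (grid size divided by the number of thread blocks that can be resident at once). The helper name and all numbers below are hypothetical and are not taken from the tool.

```python
def waves_per_multiprocessor(grid_size: int, num_sms: int, max_blocks_per_sm: int) -> float:
    """Hypothetical helper (not part of Nsight Compute).

    Assumed definition: thread blocks in the grid divided by the number of
    blocks that can be resident simultaneously on the available SMs.
    """
    if num_sms <= 0 or max_blocks_per_sm <= 0:
        raise ValueError("num_sms and max_blocks_per_sm must be positive")
    return grid_size / (num_sms * max_blocks_per_sm)


# Illustrative numbers only; real values depend on the GPU and the kernel.
grid_size = 1024          # thread blocks in the launch
max_blocks_per_sm = 4     # occupancy limit per SM for this kernel
full_device_sms = 128     # SMs on the whole device
green_context_sms = 32    # SMs assigned to the Green Context

# Scaling to the whole device understates the waves for a Green Context launch.
device_waves = waves_per_multiprocessor(grid_size, full_device_sms, max_blocks_per_sm)

# Scaling to the Green Context's SMs reflects the resources the launch can use.
green_waves = waves_per_multiprocessor(grid_size, green_context_sms, max_blocks_per_sm)

print(f"scaled to device SMs:        {device_waves:.1f} waves")
print(f"scaled to Green Context SMs: {green_waves:.1f} waves")
```

With these example values, the same launch corresponds to more waves when only the Green Context's SMs are counted, which is the quantity that matters when reasoning about tail effects.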

    NVIDIA Nsight Compute

    NVIDIA Nsight Compute CLI

    • Added the option --forward-signals to transparently forward signals to the profiled application.

    Resolved Issues

    • Fixed an issue where some ncu console messages were truncated after 1024 characters.
    • Fixed some display issues related to Green Context tables.
    • Improved the performance of remote profiling in application replay mode.
    • Fixed a hang in certain scenarios when profiling dependent kernels with device-mapped host allocations.
    • Fixed missing correlation between JIT-compiled PTX and SASS in some situations.
    • Fixed an error when profiling a CUDA graph kernel node doing a cluster launch on driver 580 or newer.

For a complete overview of all NVIDIA® Nsight™ Compute features and access to resources, please visit the main Nsight™ Compute page.

NVIDIA® Nsight™ Compute 2025.3 is available for download under the NVIDIA Registered Developer Program.
