Updates in 2025.2.0

    General

    • Added support for collecting C2C link information on Blackwell GPUs.
    • Added support for collecting request counts and hit rates for Constant Caches on Turing and newer GPU architectures. The Memory Charts for these architectures were updated to include the new Constant Cache information.
    • CPU call stack filtering now supports Python call stacks.
    • Instruction statistics now show warp- and thread-level instruction counts per opcode category. Added the new metrics sass__inst_executed_per_opcode_category and sass__thread_inst_executed_per_opcode_category. See the Metrics Reference (https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html#metrics-reference) for details.
    • Enhanced several rules to produce tables pointing to the source location of interest.
    • Improved the NvRules API to support generic tables for the UI and CLI.
    • Improved the NvRules and Python Report Interface documentation to be more pythonic.
    • Added APIs to the Python Report Interface for querying rules and source markers in the report.
    • Added the Occupancy Calculator Python Interface, which provides a Python-based interface for performing occupancy calculations and analysis of kernels on NVIDIA GPUs.
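
    To illustrate the kind of calculation an occupancy calculator performs, the following is a minimal, simplified sketch in plain Python. It does not use the Occupancy Calculator Python Interface itself; the names SmLimits and theoretical_occupancy are hypothetical, the per-SM limits are illustrative placeholders rather than values for any specific GPU, and hardware allocation granularities are ignored.

```python
import math
from dataclasses import dataclass

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


@dataclass(frozen=True)
class SmLimits:
    """Hypothetical per-SM resource limits (placeholders, not a real GPU)."""
    max_warps: int = 64
    max_blocks: int = 32
    registers: int = 65536
    shared_mem_bytes: int = 102400


def theoretical_occupancy(threads_per_block, regs_per_thread,
                          smem_per_block, limits=SmLimits()):
    """Return active warps / maximum warps per SM for one kernel launch.

    Simplified model: each resource (warp slots, registers, shared memory)
    caps the number of resident blocks; the tightest cap wins.
    """
    warps_per_block = math.ceil(threads_per_block / WARP_SIZE)

    # Blocks that fit by warp slots.
    by_warps = limits.max_warps // warps_per_block

    # Blocks that fit by register file (registers are used per thread).
    regs_per_block = regs_per_thread * WARP_SIZE * warps_per_block
    by_regs = (limits.registers // regs_per_block
               if regs_per_block else limits.max_blocks)

    # Blocks that fit by shared memory.
    by_smem = (limits.shared_mem_bytes // smem_per_block
               if smem_per_block else limits.max_blocks)

    blocks = min(limits.max_blocks, by_warps, by_regs, by_smem)
    return blocks * warps_per_block / limits.max_warps


if __name__ == "__main__":
    # Doubling register usage halves the resident blocks in this model.
    print(theoretical_occupancy(256, 32, 0))
    print(theoretical_occupancy(256, 64, 0))
```

    In this model, the resource that admits the fewest resident blocks is the occupancy limiter. The actual interface should be consulted for real device limits and allocation granularities.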

    NVIDIA Nsight Compute

    • Added product-wide search functionality via a new search bar and tool window.
    • The Source page now shows scoreboard dependencies in SASS.
    • Converted more tooltips into interactive tooltips. Interactive tooltips can now be pinned and dragged.
    • Added source correlation navigation controls which allow navigation to the previous or next block of correlated lines.

    NVIDIA Nsight Compute CLI

    Resolved Issues

    • CUDA Graphs in the Resources View use the current UI theme.
    • Resolved several issues when interacting with timelines on the Details page.
    • Resolved issues with Python syntax highlighting on the Source page.
    • Disabled deprecated columns in the API Stream tool window.
    • Fixed an issue where the Source page could show incorrect correlation when some source files were not resolved.
    • Reduced the number of replay passes required for collecting the PmSampling.section on GH100 with applicable drivers.
    • Resolved an issue where --native-include did not work properly when using range replay and cu(da)ProfilerStop.
    • Fixed an "Invalid or unsupported charset:ANSI_X3.4-1968" error when using the CLI on some systems.
    • Fixed an issue where the memory available for saving context state during replay could be computed incorrectly when the application was using managed memory.
    • Fixed an issue where some metrics were not listed for collection in section files for GB20x GPUs.

For a complete overview of all NVIDIA® Nsight™ Compute features and access to resources, please visit the main Nsight™ Compute page.

NVIDIA® Nsight™ Compute 2025.2 is available for download under the NVIDIA Registered Developer Program.
