Updates in 2023.3.1

    General
    • Supports CUDA Toolkit 12.3 Update 1.
    • Switched to using OpenSSL version 1.1.1w.
    • Improved the speedup estimates for the IssueSlotUtilization rule and its child rules.
    • Updated report files and documentation for the samples located at extras/samples/.

    Resolved Issues

    • Fixed collection of context switch data during PM Sampling when using Range Replay.
    • Fixed a potential crash of NVIDIA Nsight Compute when an invalid regular expression was provided as a requested metric.
    • Improved the performance of NVIDIA Nsight Compute in cases where only a single process is profiled and --target-processes all is specified.
    • Fixed an issue where the register counts reported on the Source page were too high.
    • Fixed a bug that could cause a GPU fault while collecting SW counters through PerfWorks.
    • Fixed incorrect baseline values being shown for the Runtime Improvement values on the Summary page.

Updates in 2023.3.0

    General
    • Supports CUDA Toolkit 12.3.
    • Added support to collect many metrics by sampling the GPU's performance monitors (PM) periodically at fixed intervals. The results can be visualized on a timeline.
    • Added WSL profiling support on Windows 10 WSL with OS build version 19044 and greater. WSL profiling is not supported on Windows 10 WSL for systems that exceed 1 TB of system memory.
    • Rule outputs are prioritized to improve the accuracy of estimated speedups. The Summary page now shows the most actionable optimization advice when a result row is selected.
    • Improved the handling and reporting for unavailable metrics during collection and when applying rules.
    NVIDIA Nsight Compute
    • Added support to see the source files of two profile results side by side using Source Comparison. This allows you to quickly identify source differences and understand changes in metric values.
    • The Summary page is now the default page when a report is opened. Previous behavior can be enabled in the options dialog.
    • On the Summary and Raw pages, values from all/selected rows are automatically aggregated in the column header for applicable metrics. Selected individual cells are aggregated in the bottom status bar.
    • Added Launch Name and Device options to the filter dialog opened by the Apply Filters button in the report header. You can now filter results on the Summary and Raw pages.
    • Added support for source view profiles that persist the Source page configuration and allow you to re-apply it to other reports.
    • The Metric Details tool window now supports querying metrics beyond the current report by using the chip:<chipname> tag in the search.
    • Added support for CUDA Graph Edge Data (such as port and dependency type) and CUDA Graph Conditional Handles in the Resources tool window.
    • The Acceleration Structure Viewer and Resources tool window now support OptiX Opacity Micromaps.
    NVIDIA Nsight Compute CLI
    • Tracking and profiling all child processes (--target-processes all) is now the default for ncu.
    • Improved reporting of requested but unavailable metrics. Metrics requested in section files are by default considered optional and only cause a warning to be shown.
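
Because ncu now tracks all child processes by default, scripts that relied on the earlier behavior may want to pass --target-processes explicitly. The following is a minimal sketch, assuming ncu is on the PATH and that the documented values for --target-processes are all and application-only. The helper build_ncu_command, the application ./my_app, and its arguments are hypothetical names used only for illustration. The helper only assembles the command line and does not launch the profiler.

```python
import shlex

# Values accepted by --target-processes, as we understand the ncu CLI.
VALID_TARGETS = ("all", "application-only")


def build_ncu_command(app_argv, target_processes="application-only", export=None):
    """Assemble an ncu argument list (hypothetical helper, not part of Nsight Compute).

    app_argv          -- the application and its arguments, as a list of strings
    target_processes  -- "all" (the default since 2023.3.0) or "application-only"
    export            -- optional report name passed to --export
    """
    if target_processes not in VALID_TARGETS:
        raise ValueError(f"unsupported --target-processes value: {target_processes!r}")
    cmd = ["ncu", "--target-processes", target_processes]
    if export is not None:
        cmd += ["--export", export]
    # The application command always comes last.
    cmd += list(app_argv)
    return cmd


if __name__ == "__main__":
    # Profile only the root application process, ignoring any children it spawns.
    cmd = build_ncu_command(["./my_app", "--size", "1024"])
    print(shlex.join(cmd))
    # prints: ncu --target-processes application-only ./my_app --size 1024
```

The resulting list can be handed to subprocess.run(cmd) on a machine where Nsight Compute is installed. Spelling out the option keeps a script's behavior the same regardless of which ncu version supplies the default.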
    Resolved Issues
    • Support for tracking child processes launched with system() is available on Linux ppc64le.
    • Improved the behavior of following SASS navigation links on the Source page.
    • Fixed issues with profiling CUDA graphs in graph-profiling mode when nodes are associated with a non-current CUDA context.
    • Fixed an issue in L2 bandwidth calculations in the hierarchical roofline sections.

For a complete overview of all NVIDIA Nsight Compute features and access to resources, please visit the main Nsight Compute page.

NVIDIA Nsight Compute 2023.3 is available for download under the NVIDIA Registered Developer Program.
