Updates in 2024.2.0

    • PM sampling timelines now show the sampled GPU workload activities.
    • Added support for collecting Python Call Stacks alongside native ones to better understand the context of a workload in Python applications.
    • Demangled kernel names can now be automatically simplified or manually renamed. For reports with multiple results, all names are considered during simplification to make them easier to distinguish.
    • Removed support for ppc64le.

    NVIDIA Nsight Compute

    • Redesigned the report header for easier access to all report pages. All actions are now sorted into clearly labeled buttons. The focused result selection is now integrated into the current row. When the current result is added as a baseline, the row itself is updated to reflect this, instead of a separate row being shown.
    • Redesigned Source Page and Source Comparison controls to allow more vertical space.
    • The Navigate By dropdowns of the Source Page and Source Comparison can each be linked to the other view. Selecting a column in one dropdown then selects it in the other view as well, provided the column is also available there.
    • Added Inline Table support to each side of the Source Comparison document separately.
    • Added rich support for Python and Fortran source syntax highlighting. Enhanced CUDA-C and PTX syntax highlighting.
    • Added a Statistics Table to the Source Page that allows you to quickly see aggregated metrics across a custom selection of lines.
    • Improved tooltips in the memory chart to show more detailed information when metrics are missing.
    • The Acceleration Structure Viewer can now compute ray-geometry intersection and traversal timing heatmaps.
    • Added support for ignoring directories in the section search folder.
    • Added support for specifying custom metric descriptions in section files.
    • Added a warning if the opened report is newer than the UI and may not be fully compatible.

    NVIDIA Nsight Compute CLI

    • The raw page CSV output now includes metric instance values when these are enabled for printing.

    Resolved Issues

    • Improved handling of short workloads during PM sampling.
    • Improved units for several metrics.
    • Fixed an issue where some metrics did not show aggregates on the Summary and Raw pages.
    • Fixed an issue where profiled applications could inadvertently override which PerfWorks library is loaded by the tool.
    • Fixed an issue where kernel names containing $ were modified by the shell when profiling them from the System Trace activity.
    • Fixed an issue where reports could not be saved with the expected file extension in certain cases.
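
The `$` issue listed above comes down to shell expansion: an unquoted `$name` inside a command line is treated as a variable reference. The following is a minimal Python sketch of that effect and of quoting as a guard. It assumes a POSIX `sh` is available, the kernel name is hypothetical, and it is not meant to describe how Nsight Compute itself resolved the issue.

```python
import os
import shlex
import subprocess

# Hypothetical kernel name containing '$'; chosen only for illustration.
KERNEL_NAME = "reduce$_sm80"


def echo_via_shell(fragment: str) -> str:
    """Pass `fragment` through a POSIX shell and return what the shell saw."""
    # Minimal environment so the variable `_sm80` is guaranteed to be unset.
    env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
    result = subprocess.run(
        ["sh", "-c", "printf %s " + fragment],
        capture_output=True,
        text=True,
        env=env,
        check=True,
    )
    return result.stdout


# Unquoted: the shell expands `$_sm80` (unset here), so the name is altered.
unquoted = echo_via_shell(KERNEL_NAME)

# Quoted with shlex.quote: the shell receives the name unchanged.
quoted = echo_via_shell(shlex.quote(KERNEL_NAME))

print(repr(unquoted))
print(repr(quoted))
```

When a kernel name has to pass through a shell command line, quoting it as a single argument (here with `shlex.quote`) keeps the `$` literal.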

For a complete overview of all NVIDIA Nsight Compute features and access to resources, please visit the main Nsight Compute page.

NVIDIA Nsight Compute 2024.2 is available for download under the NVIDIA Registered Developer Program.
