Updates in 2023.1.1

General

  • Added support for the CUDA toolkit 12.1 Update 1.
  • Added new configuration options to set the default view mode and precision for the Source page.

Resolved Issues

  • Added support for the DT_RUNPATH attribute when intercepting calls to dlopen. This fixes an issue where applications or libraries relying on DT_RUNPATH could not find all dynamic libraries when launched by NVIDIA Nsight Compute.
  • Improved interaction between custom additional metrics and the selected metric set. Adding custom metrics no longer forces switching to the custom metric set.
  • Added ability to gracefully skip folders with insufficient access permissions while importing source code.
  • Fixed the calculation of the peak values for the L1 and L2 cache bandwidths in the hierarchical roofline charts.
  • Fixed an issue that prevented modules loaded with the function optixModuleCreateFromPTX from showing up in the Optix: Modules table of the Resources tool window.
  • Fixed handling of deprecated functions when querying function pointers from the OptiX interception library.
  • Fixed an issue where sections or rules sometimes could not be selected easily in the tool window.
  • Fixed an issue with Reset Application Data that prevented some settings from resetting correctly.
  • Fixed potential crash of NVIDIA Nsight Compute when Reset Application Data was executed multiple times in a row.
  • Fixed a crash when saving or loading baselines for non-kernel results.
  • Fixed an issue where memory written while executing a CUDA graph was not properly restored in single-pass graph profiling.
  • Fixed potential memory leak while collecting SW counters for modules with unpatched kernel functions.
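
To check whether an application or library is affected by the DT_RUNPATH fix listed above, one option is to inspect its dynamic section. The following is a minimal sketch and not part of NVIDIA Nsight Compute: it parses the text printed by `readelf -d` (from binutils, assumed to be installed) and collects the RPATH and RUNPATH entries. The function names are hypothetical.

```python
import re
import subprocess

# Matches dynamic-section lines such as:
#  0x000000000000001d (RUNPATH)   Library runpath: [$ORIGIN/../lib]
_ENTRY = re.compile(r"\((RPATH|RUNPATH)\)\s+Library r(?:un)?path:\s*\[(.*?)\]")


def parse_search_paths(readelf_output):
    """Return {'RPATH': [...], 'RUNPATH': [...]} from `readelf -d` text."""
    found = {"RPATH": [], "RUNPATH": []}
    for tag, value in _ENTRY.findall(readelf_output):
        # Several directories may be joined with ':' in a single entry.
        found[tag].extend(p for p in value.split(":") if p)
    return found


def search_paths_of(binary):
    """Run `readelf -d` on a binary (assumes binutils is on PATH)."""
    out = subprocess.run(
        ["readelf", "-d", binary], check=True, capture_output=True, text=True
    ).stdout
    return parse_search_paths(out)
```

A non-empty RUNPATH list for an executable or one of its libraries suggests that it relies on the attribute this fix addresses.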

Updates in 2023.1.0

General

  • Added support for the CUDA toolkit 12.1.
  • Added support for OptiX 7.7.
  • Added a new app-range replay mode to profile ranges without API capture by relaunching the entire application multiple times.
  • Added the sharedBankConflicts sample CUDA application and an accompanying document, which show how NVIDIA Nsight Compute can be used to identify and analyze shared memory bank conflicts that result in inefficient shared memory accesses. Refer to the README.TXT file, sample code and document under extras/samples/sharedBankConflicts.
  • Jupyter notebook samples are available in the Nsight training GitHub repository.
  • The equivalent of the high-level Python report interface is now available in rule files.

NVIDIA Nsight Compute

  • Added support for profiling individual metrics in Interactive Profile activity. A new input field for metrics was added in the Metric Selection tool window.
  • Files on remote systems can be opened directly from the menu.
  • Metric- and section-related entries in the menu, Profile activity and Metric Selection tool window were renamed for clarity.
  • CPU and GPU NUMA topology metrics can be collected on applicable systems. Topology information is shown in a new NUMA Affinity section.
  • Added content-aware suggestions to the Details page to provide suggestions based on the selected profiling options.
  • Added support for re-resolving source files on the Source page.
  • Not-issued warp stall reasons are removed from the Source Counters section tables and hidden by default on the Source page. Users should focus on regular warp stall reasons by default and only inspect not-issued samples if this distinction is needed.
  • Added support for searching for missing CUDA source files to permanently import into the report, using the Source Lookup options in the Interactive Profile activity.
  • The Source page now shows metric values as percentages by default. New buttons were added to switch between different value modes.

NVIDIA Nsight Compute CLI

  • Added support for config files in the current working or user directory to set default ncu parameters. See the General options for more details.
  • Added the --range-filter command line option, which allows selecting a subset of the enabled profile ranges.
  • Added the --source-folders command line option, which recursively searches for missing CUDA source files to permanently import into the report.
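
As an illustration of how these options might be combined with the app-range replay mode listed under General, the following minimal sketch assembles an ncu command line in Python. It is not an official wrapper. The helper name and its parameters are hypothetical. The option names --range-filter and --source-folders come from these notes; `--replay-mode app-range`, `-o`, `--import-source yes` and the comma-separated form of the folder list are assumptions that should be verified with `ncu --help`. The range filter value is passed through unchanged because its syntax is not described here.

```python
import shlex


def build_ncu_command(app, app_args=(), report=None, app_range=False,
                      range_filter=None, source_folders=()):
    """Assemble an ncu argument list (hypothetical helper, see the text above)."""
    cmd = ["ncu"]
    if report:
        cmd += ["-o", report]  # assumed: -o names the output report
    if app_range:
        # Relaunch the whole application to profile ranges without API capture.
        cmd += ["--replay-mode", "app-range"]
    if range_filter:
        # Passed through verbatim; consult the CLI documentation for the syntax.
        cmd += ["--range-filter", range_filter]
    if source_folders:
        # Assumed: source import must be enabled and folders are comma-separated.
        cmd += ["--import-source", "yes",
                "--source-folders", ",".join(source_folders)]
    return cmd + [app, *app_args]


if __name__ == "__main__":
    cmd = build_ncu_command("./my_app", ["--size", "1024"], report="my_report",
                            app_range=True, source_folders=["src", "third_party"])
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the arguments safe to hand to `subprocess.run` without shell quoting concerns.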

Resolved Issues

  • Fixed performance issues on the Summary and Raw pages for large reports.
  • Improved support for non-ASCII characters in filenames.
  • Fixed an issue with delayed updates of assembly analysis information on the Source page's Source and PTX views.
  • Fixed potential crashes when using the Python report interface.


For a complete overview of all NVIDIA Nsight™ Compute features and access to resources, please visit the main Nsight™ Compute page.

NVIDIA® Nsight™ Compute 2023.1 is available for download under the NVIDIA Registered Developer Program.



PRODUCT INFO

Supported Platforms


        Windows            Linux                    Mac       DRIVE OS
Host    Windows x86_64[1]  Linux x86_64[1]          MacOS[1]  -
                           Linux aarch64 sbsa[1]
                           Linux aarch64 (L4T)[2]
Target  Windows x86_64[1]  Linux x86_64[1]          -         DRIVE OS QNX aarch64[2][3]
                           Linux PowerPC[1]                   DRIVE OS Linux aarch64[2][3]
                           Linux aarch64 sbsa[1]
                           Linux aarch64 (L4T)[2]

Host platforms support the Nsight Compute UI for viewing reports, interactive profiling and remote connections. Applications are profiled on target platforms, which also support the Nsight Compute command line interface.


Supported NVIDIA GPU architectures

  • Ada: AD10x
  • Ampere: A100 with Multi-Instance GPU, GA10x
  • Hopper: H100 with Multi-Instance GPU
  • Turing: TU1xx
  • Volta: GV100[1], GV10B[2]

[1] Available in this download and the CUDA Desktop Toolkit.
[2] Available in the Embedded or Drive toolkits only.
[3] Only the command line interface (CLI) is provided for these platforms. There is no Nsight Compute GUI application for these platforms.


Recommended Drivers

  • NVIDIA Windows Driver - 531.14 or newer
  • NVIDIA Linux Driver - 530.30.02 or newer

We recommend using drivers provided with the most recent CUDA Toolkit production release or a newer version. Older driver versions are also supported.
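
A script that prepares a profiling environment could compare the installed driver against the recommended versions above. The sketch below is a minimal, hypothetical helper: it only compares dotted version strings, and its result is advisory because older drivers are also supported. How the installed version is obtained (for example from `nvidia-smi`) is left to the caller.

```python
# Recommended minimum driver versions from the list above.
RECOMMENDED = {"windows": "531.14", "linux": "530.30.02"}


def version_key(version):
    """Turn '530.30.02' into (530, 30, 2) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def meets_recommendation(installed, platform):
    """True if the installed driver is at or above the recommended version."""
    return version_key(installed) >= version_key(RECOMMENDED[platform])
```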


References