Updates in 2022.1.1

General
  • Filtering kernel launches or profile results based on NVTX domains/ranges now takes registered strings in the payload field into account, if the range name is empty.
  • Added support for the suffix .max_rate for ratio metrics.
Resolved Issues
  • Fixed a crash during the disassembly of the kernel's SASS code for the Source page.
  • Fixed a crash on exit of the NVIDIA Nsight Compute UI.
  • Fixed a hang during profiling when CPU call stack collection is enabled.
  • Fixed a missing flush of UVM buffers before taking memory checkpoints during Range Replay.
  • Fixed tracking of memory during Range Replay, if the CUDA context has any device mapped memory allocations.
  • Fixed the maximum available shared memory sizes in the Occupancy Calculator for NVIDIA Ampere GPUs.
  • Fixed incorrect initialization of the kernel's shared memory usage when opening the Occupancy Calculator from a profile report.

Updates in 2022.1.0

General
  • Added support for the CUDA toolkit 11.6.
  • Added a new Range Replay mode to profile ranges of multiple, concurrent kernels. Range replay is available in the NVIDIA Nsight Compute CLI and the non-interactive Profile activity.
  • Added a new rule to detect non-fused floating-point instructions.
  • The Uncoalesced Memory access rules now show results in a dynamic table.
  • Unix Domain Sockets and Windows Named Pipes are used for local connection between the host and target processes on x86_64 Linux and Windows, respectively.
  • The NvRules API now supports querying action names using different function name bases (e.g. demangled).
NVIDIA Nsight Compute
  • The default report page is now chosen automatically when opening a report.
  • Added coverage for ECC (Error Correction Code) operations in the L2 Cache table of the Memory Analysis section.
  • Added a new L2 Evict Policies table to the Memory Analysis section.
  • The Occupancy Calculator now updates automatically when the input changes.
  • Added new metric Thread Instructions Executed to the Source page.
  • Added tooltips to the Register Dependency columns in the Source page to identify the associated register more conveniently.
  • Improved the selection of Sections and Sets in the Profile activity connection dialog.
  • NVLink utilization is shown in the NVLink Tables section.
  • NVLink links are colored according to the measured throughput.
NVIDIA Nsight Compute CLI
  • The --kernel-regex and --kernel-regex-base options are no longer supported. Use --kernel-name and --kernel-name-base, respectively, which were added in 2021.1.0.
  • Added support to resolve CUDA source files in the --page source output with the new --resolve-source-file command line option.
  • Added new option --target-processes-filter to filter the processes being profiled by name.
  • The CPU Stack Trace is shown in the NVIDIA Nsight Compute CLI output.
Resolved Issues
  • Fixed the calculation of aggregated average instruction execution metrics in non-SASS views on the Source page.
  • Fixed atomic instructions being counted as both loads and stores in the Memory Analysis tables.
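
Scripts that still pass the removed --kernel-regex and --kernel-regex-base options to the NVIDIA Nsight Compute CLI need to switch to --kernel-name and --kernel-name-base. The following is a minimal sketch of such a rewrite, not an official tool. The helper name migrate_kernel_regex_args is hypothetical, and the sketch assumes that --kernel-name accepts a regular expression through its regex: prefix, so verify the result against the CLI documentation for your version.

```python
def migrate_kernel_regex_args(argv):
    """Rewrite removed --kernel-regex* options to --kernel-name* equivalents.

    Hypothetical helper for illustration. It rewrites every matching
    argument in the list, so pass only the profiler's own options and not
    arguments that belong to the profiled application.
    """
    migrated = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        # Accept both "--option value" and "--option=value" forms.
        name, sep, value = arg.partition("=")
        if name in ("--kernel-regex", "--kernel-regex-base"):
            if not sep:
                # Space-separated form: the value is the next argument.
                if i + 1 >= len(argv):
                    raise ValueError(f"{name} requires a value")
                i += 1
                value = argv[i]
            if name == "--kernel-regex":
                # Assumption: the regex: prefix requests regex matching.
                migrated += ["--kernel-name", f"regex:{value}"]
            else:
                migrated += ["--kernel-name-base", value]
        else:
            migrated.append(arg)
        i += 1
    return migrated


if __name__ == "__main__":
    old = ["ncu", "--kernel-regex", "^gemm", "--kernel-regex-base", "demangled"]
    print(" ".join(migrate_kernel_regex_args(old)))
```

The helper leaves all other arguments untouched, so it can be applied to an existing option list before the application path is appended.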


For a complete overview of all NVIDIA Nsight™ Compute features and access to resources, please visit the main Nsight™ Compute page.

NVIDIA® Nsight™ Compute 2022.1 is available for download under the NVIDIA Registered Developer Program.
