Nsight Compute 2022.1 - New Features
Updates in 2022.1.1
General
- Filtering kernel launches or profile results based on NVTX domains/ranges now takes registered strings in the payload field into account, if the range name is empty.
- Added support for the suffix .max_rate for ratio metrics.
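As a minimal sketch of how the new suffix might be requested, the Python helper below assembles an ncu command line that collects a ratio metric with each ratio suffix, including .max_rate. The base name example__ratio_metric and the application path ./my_app are placeholders, not real identifiers; actual metric names can be listed with ncu --query-metrics. The helper only builds the argument list and does not launch the profiler.

```python
import shlex

# Suffixes that apply to ratio metrics; .max_rate is the one added in 2022.1.1.
RATIO_SUFFIXES = (".pct", ".ratio", ".max_rate")


def ratio_metric_names(base, suffixes=RATIO_SUFFIXES):
    """Return the fully qualified metric names for one ratio metric base name."""
    return [base + suffix for suffix in suffixes]


def ncu_metrics_command(app_argv, metric_names):
    """Build (but do not run) an ncu invocation collecting the given metrics."""
    return ["ncu", "--metrics", ",".join(metric_names), *app_argv]


# "example__ratio_metric" and "./my_app" are hypothetical placeholders.
cmd = ncu_metrics_command(["./my_app"], ratio_metric_names("example__ratio_metric"))
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a machine where ncu is installed.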
Resolved Issues
- Fixed a crash during the disassembly of the kernel's SASS code for the Source page.
- Fixed a crash on exit of the NVIDIA Nsight Compute UI.
- Fixed a hang during profiling when CPU call stack collection is enabled.
- Fixed a missing flush of UVM buffers before taking memory checkpoints during Range Replay.
- Fixed memory tracking during Range Replay when the CUDA context has device-mapped memory allocations.
- Fixed the maximum available shared memory sizes in the Occupancy Calculator for NVIDIA Ampere GPUs.
- Fixed incorrect initialization of the kernel's shared memory usage when opening the Occupancy Calculator from a profile report.
Updates in 2022.1.0
General
- Added support for the CUDA toolkit 11.6.
- Added a new Range Replay mode to profile ranges of multiple, concurrent kernels. Range Replay is available in the NVIDIA Nsight Compute CLI and the non-interactive Profile activity.
- Added a new rule to detect non-fused floating-point instructions.
- The Uncoalesced Memory access rules now show results in a dynamic table.
- Unix Domain Sockets and Windows Named Pipes are used for local connection between the host and target processes on x86_64 Linux and Windows, respectively.
- The NvRules API now supports querying action names using different function name bases (e.g. demangled).
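To illustrate how Range Replay might be selected from the NVIDIA Nsight Compute CLI, the sketch below builds an ncu command line using the --replay-mode option with the value range. It assumes the application already marks the range to profile (for example with cudaProfilerStart/cudaProfilerStop or NVTX ranges); the application path and report name are placeholders. The exact set of accepted replay modes should be confirmed with ncu --help for the installed version.

```python
import shlex

# Replay modes assumed to be accepted by --replay-mode in this release.
REPLAY_MODES = ("kernel", "application", "range")


def ncu_replay_command(app_argv, replay_mode="range", report=None):
    """Build (but do not run) an ncu invocation using the given replay mode."""
    if replay_mode not in REPLAY_MODES:
        raise ValueError(f"unknown replay mode: {replay_mode!r}")
    cmd = ["ncu", "--replay-mode", replay_mode]
    if report is not None:
        # -o writes the results to a report file instead of only printing them.
        cmd += ["-o", report]
    return cmd + list(app_argv)


# "./my_app" and "ranges_report" are hypothetical placeholders.
print(shlex.join(ncu_replay_command(["./my_app"], report="ranges_report")))
```

Keeping the command as a list avoids shell quoting problems when it is handed to subprocess.run.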
NVIDIA Nsight Compute
- The default report page is now chosen automatically when opening a report.
- Added coverage for ECC (Error Correction Code) operations in the L2 Cache table of the Memory Analysis section.
- Added a new L2 Evict Policies table to the Memory Analysis section.
- The Occupancy Calculator now updates automatically when the input changes.
- Added new metric Thread Instructions Executed to the Source page.
- Added tooltips to the Register Dependency columns in the Source page to identify the associated register more conveniently.
- Improved the selection of Sections and Sets in the Profile activity connection dialog.
- NVLink utilization is shown in the NVLink Tables section.
- NVLink links are colored according to the measured throughput.
NVIDIA Nsight Compute CLI
- The --kernel-regex and --kernel-regex-base options are no longer supported. Use --kernel-name and --kernel-name-base, respectively, which were added in 2021.1.0.
- Added support to resolve CUDA source files in the --page source output with the new --resolve-source-file command line option.
- Added the new option --target-processes-filter to filter the processes being profiled by name.
- The CPU Stack Trace is shown in the NVIDIA Nsight Compute CLI output.
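Scripts that still pass the removed kernel filter options need updating. The following sketch shows one possible way to rewrite an existing argument list. It assumes that --kernel-name expects a regex: prefix to match by regular expression (an exact name match being the default) and that --kernel-name-base accepts the same values as the old --kernel-regex-base; both assumptions should be checked against ncu --help. The function name migrate_kernel_filter_args is made up for this example.

```python
def migrate_kernel_filter_args(argv):
    """Rewrite removed --kernel-regex* options to their --kernel-name* replacements."""
    out = []
    args = iter(argv)
    for arg in args:
        if arg in ("--kernel-regex", "--kernel-regex-base"):
            value = next(args, None)
            if value is None:
                raise ValueError(f"{arg} is missing its value")
            if arg == "--kernel-regex":
                # Assumption: the "regex:" prefix keeps regular-expression matching.
                out += ["--kernel-name", "regex:" + value]
            else:
                out += ["--kernel-name-base", value]
        else:
            out.append(arg)
    return out


# "./my_app" and the kernel expression are hypothetical placeholders.
old = ["ncu", "--kernel-regex", "gemm", "--kernel-regex-base", "demangled", "./my_app"]
new = migrate_kernel_filter_args(old)
print(new)
```

Only the space-separated form of the options is handled here; arguments written as --option=value would need an extra case.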
Resolved Issues
- Fixed the calculation of aggregated average instruction execution metrics in non-SASS views on the Source page.
- Fixed atomic instructions being counted as both loads and stores in the Memory Analysis tables.
For a complete overview of all NVIDIA Nsight™ Compute features and access to resources, please visit the main Nsight™ Compute page.
NVIDIA® Nsight™ Compute 2022.1 is available for download under the NVIDIA Registered Developer Program.