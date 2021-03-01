Nsight Compute 2021.3 - New Features
Updates in 2021.3.1
Resolved Issues
- Fixed that kernels with the same name and launch configuration were in some scenarios associated with the wrong profiling results during application replay.
- Fixed an issue with binary forward compatibility of the report format.
- Fixed an issue with applications calling into the CUDA API during process teardown.
- Fixed an issue with profiling applications that use pre-CUDA API 3.1 contexts.
- Fixed a crash when resolving files on the Source page.
- Fixed that opening reports with large embedded CUBINs would hang the UI.
- Fixed an issue with remote profiling on a target where the UI is already launched.
Updates in 2021.3.0
General
- Added support for the CUDA toolkit 11.5.
- Added a new rule for detecting inefficient memory access patterns in the L1TEX cache and L2 cache.
- Added a new rule for detecting high usage of system or peer memory.
- Added new IAction::sass_by_pc function to the NvRules API.
- The Python-based report interface is now also available for Windows and macOS hosts.
- Added Hierarchical Roofline section files in a new "roofline" section set.
- Added support for collecting CPU call stack information.
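As an illustration of the Python-based report interface mentioned above, the following is a minimal sketch that lists one metric per profiled kernel. It assumes the `ncu_report` module shipped with Nsight Compute is importable and exposes `load_report`, `num_ranges`, `range_by_idx`, `num_actions`, `action_by_idx`, `name`, `metric_by_name` and `as_double`; the helper names `summarize` and `load_and_summarize` are hypothetical, and treating a missing metric as `None` is an assumption.

```python
def summarize(report, metric_name):
    """Return (kernel name, metric value) pairs for every profiled kernel.

    `report` is expected to behave like the object returned by
    ncu_report.load_report (assumption: ranges contain actions,
    and each action is one profiled kernel launch).
    """
    rows = []
    for range_idx in range(report.num_ranges()):
        current_range = report.range_by_idx(range_idx)
        for action_idx in range(current_range.num_actions()):
            action = current_range.action_by_idx(action_idx)
            metric = action.metric_by_name(metric_name)
            # Guard against a metric that was not collected
            # (assumed to be reported as None).
            value = metric.as_double() if metric is not None else None
            rows.append((action.name(), value))
    return rows


def load_and_summarize(path, metric_name):
    """Load a report file from disk and summarize one metric (hypothetical helper)."""
    # Imported lazily so the summarize() logic can be reused without
    # Nsight Compute installed; the module location depends on the install.
    import ncu_report

    return summarize(ncu_report.load_report(path), metric_name)
```

A call such as `load_and_summarize("report.ncu-rep", "gpu__time_duration.sum")` would then yield one row per kernel launch, provided that metric was collected in the report.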
NVIDIA Nsight Compute
- Added support for new remote profiling SSH connection and authentication options as well as local SSH configuration files.
- Added an Occupancy Calculator which can be opened directly from a profile report or as a new activity. It offers feature parity with the CUDA Occupancy Calculator spreadsheet.
- Added a new Baselines tool window to manage (hide, update, re-order, save/load) baseline selections.
- The Source page views now support multi-line/cell selection and copy/paste. Different colors are used for highlighting selections and correlated lines.
- The search edit on the Source page now supports Shift+Enter to search in reverse direction.
- The Memory Workload Analysis Chart can be configured to show throughput values instead of transferred bytes.
- The Profile activity now supports the --devices option.
- The NVLink Topology diagram displays per-NVLink metrics.
- Added a new tool window showing the CPU call stack at the location where the current thread was suspended during interactive profiling activities.
- If enabled, the Call Stack / NVTX page of the profile report shows the captured CPU call stack for the selected kernel launch.
NVIDIA Nsight Compute CLI
- Added support for printing source/metric content with the new --page source and --print-source command line options.
- Added new option --call-stack to enable collecting the CPU call stack for every profiled kernel launch.
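The new command line options above could be combined as in the following sketch, which only assembles argument lists and does not run anything. The helper names `profile_command` and `print_source_command` are hypothetical. Using `--export` and `--import` to write and read the report, and `sass` as the value for `--print-source`, are assumptions about typical usage rather than something stated in these notes.

```python
import shlex


def profile_command(app_argv, report_name, call_stack=False):
    """Build an ncu command line that profiles an application into a report."""
    cmd = ["ncu", "--export", report_name]
    if call_stack:
        # Collect the CPU call stack for every profiled kernel launch.
        cmd.append("--call-stack")
    return cmd + list(app_argv)


def print_source_command(report_file, view="sass"):
    """Build an ncu command line that prints the source page of a saved report."""
    # `view` is passed through unchanged; "sass" is an assumed default.
    return ["ncu", "--import", report_file,
            "--page", "source", "--print-source", view]


# Render the commands as shell-quoted strings for display or logging.
profile_line = shlex.join(profile_command(["./app"], "run1", call_stack=True))
source_line = shlex.join(print_source_command("run1.ncu-rep"))
```

The resulting lists can be passed directly to `subprocess.run` on a machine where `ncu` is on the PATH; keeping them as lists avoids shell quoting problems with application arguments.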
Resolved Issues
- Fixed that memory_* metrics could not be collected with the --metrics option.
- Fixed that selection and copy/paste were not supported for section header tables on the Details page.
- Fixed issues with the Source page when collapsing the content.
- Fixed that the UI could crash when applying rules to a new profile result.
- Fixed that PC Sampling metrics were not available for Profile Series.
- Fixed that local profiling did not work if no non-loopback address was configured for the system.
