Learn what’s new in the latest releases of cuDNN, CUDA, TensorRT, DALI, and Nsight Compute.
cuDNN 7.5
The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. This version of cuDNN includes:
- Multi-head attention for accelerating popular models such as Transformer
- Improved depth-wise separable convolution for training models such as Xception and MobileNet
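To see why depth-wise separable convolution matters for models such as Xception and MobileNet, the short sketch below compares weight counts for a standard convolution and its depth-wise separable counterpart. It is plain Python arithmetic, not cuDNN code. The function names and the layer sizes (3 x 3 kernel, 128 input channels, 256 output channels) are illustrative assumptions, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weight count of a k x k depth-wise convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
std = standard_conv_params(3, 128, 256)
sep = depthwise_separable_params(3, 128, 256)
print(f"standard: {std} weights, depth-wise separable: {sep} weights")
print(f"reduction factor: {std / sep:.1f}x")
```

The separable form splits spatial filtering from channel mixing, which is where the reduction in weights comes from.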
CUDA 10.1 Update 1
CUDA is the parallel computing platform and programming model for general-purpose computing on NVIDIA GPUs. This version of CUDA includes:
- Improved SpMM/SpMV kernel performance in cuSPARSE for sparse applications in high-performance computing (HPC) and machine learning
- Graph API support in cuFFT to allow use of FFT kernels in CUDA Graphs
- Extended data type support and improved heuristics in cuBLAS for all HPC and machine learning applications
- APIs for stream parsing, memory control, decoding, and multi-channel bitstreams in nvJPEG
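For readers unfamiliar with the SpMV operation named above, here is a minimal reference sketch of a sparse matrix-vector product in plain Python. It assumes the matrix is stored in compressed sparse row (CSR) form. It is not the cuSPARSE API and says nothing about how the cuSPARSE kernels are implemented; the function name csr_spmv and the sample matrix are hypothetical.

```python
def csr_spmv(values, col_indices, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values      : non-zero entries of A, row by row
    col_indices : column index of each entry in values
    row_ptr     : row_ptr[i]..row_ptr[i + 1] delimits row i inside values
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        # Only the stored non-zeros of this row contribute to the result.
        for idx in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[idx] * x[col_indices[idx]]
        y.append(acc)
    return y


# Illustrative 3 x 3 matrix:
# [[4, 0, 0],
#  [0, 0, 2],
#  [1, 0, 3]]
values = [4.0, 2.0, 1.0, 3.0]
col_indices = [0, 2, 0, 2]
row_ptr = [0, 1, 2, 4]
x = [1.0, 2.0, 3.0]

print(csr_spmv(values, col_indices, row_ptr, x))
```

SpMM generalizes the same idea by multiplying the sparse matrix with a dense matrix, one column of the dense operand at a time.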
TensorRT 5.1 GA
NVIDIA TensorRT is a platform for high-performance deep learning inference. This version of TensorRT includes:
- Optimization of new models such as DenseNet and TinyYOLO with support for over 20 new layers, activations, and operations in TensorFlow and ONNX
- The ability to speed up deployment by updating model weights in an existing engine without rebuilding it
- Application deployment in INT8 precision on NVIDIA Xavier™-based NVIDIA AGX™ platforms using the NVIDIA DLA accelerator
DALI 0.9
The NVIDIA Data Loading Library (DALI) is a portable, open-source GPU-accelerated library for decoding and augmenting images and videos to accelerate deep learning applications. This version of DALI includes:
- Video pre-processing workflow support with a new optical flow operator
- ROI-based JPEG decoding for high-resolution images
Nsight Compute 2019.3
NVIDIA Nsight Compute is an interactive kernel profiler for CUDA applications. This version of Nsight Compute includes:
- Support for CUDA 10.1 Update 1 and the latest NVIDIA Turing GPUs
- Expanded PC sampling and NVTX control
- Workflow and UI improvements along with extended command line switches