How CUDA Math Libraries can help you unleash the power of the new NVIDIA A100 GPU

Azzam Haidar, NVIDIA | Harun Bayraktar, NVIDIA

GTC 2020

Part 1: Harun Bayraktar, Senior Manager, CUDA Math Libraries, NVIDIA
Part 2: Azzam Haidar, Senior Math Libraries Engineer, NVIDIA

In the first part of this talk, we will focus on how the new features of the NVIDIA A100 GPU can be accessed through the CUDA 11.0 Math Libraries. These include 3rd generation tensor core functionality for double precision (FP64), TensorFloat-32 (TF32), half precision (FP16), and bfloat16 (BF16); as well as increased memory bandwidth, multi-GPU performance improvements, and the hardware JPEG decoder.

In the second part of the talk, we will take a deep dive into the mixed-precision, tensor core accelerated solvers and see how 3rd generation tensor cores can boost many HPC workloads, bringing speedups of up to 4x on the A100 GPU.

