GTC 2020: How CUDA Math Libraries can help you unleash the power of the new NVIDIA A100 GPU
Azzam Haidar, NVIDIA | Harun Bayraktar, NVIDIA
Part 1: Harun Bayraktar, Senior Manager, CUDA Math Libraries, NVIDIA
Part 2: Azzam Haidar, Senior Math Libraries Engineer, NVIDIA
In the first part of this talk, we will focus on how the new features of the NVIDIA A100 GPU can be accessed through the CUDA 11.0 Math Libraries. These include third-generation Tensor Core functionality for double precision (FP64), TensorFloat-32 (TF32), half precision (FP16), and bfloat16 (BF16), as well as increased memory bandwidth, multi-GPU performance improvements, and the hardware JPEG decoder.
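To make the trade-offs between the precisions listed above concrete, the following sketch tabulates the exponent and mantissa widths of each format and derives its machine epsilon. It is an illustration only, not part of the talk: the `FORMATS` table and the `machine_epsilon` helper are names introduced here, and FP32 is included purely as a reference point.

```python
# Bit layouts of the floating-point formats named in the abstract.
# Each entry is (exponent bits, explicitly stored mantissa bits).
# FP32 is not in the abstract's list; it is added here for comparison.
FORMATS = {
    "FP64": (11, 52),
    "FP32": (8, 23),
    "TF32": (8, 10),
    "FP16": (5, 10),
    "BF16": (8, 7),
}


def machine_epsilon(name):
    """Return the gap between 1.0 and the next representable value."""
    _, mantissa_bits = FORMATS[name]
    return 2.0 ** -mantissa_bits


if __name__ == "__main__":
    for name, (exp_bits, man_bits) in FORMATS.items():
        print(f"{name}: exponent={exp_bits} bits, mantissa={man_bits} bits, "
              f"epsilon={machine_epsilon(name):.3e}")
```

The table shows why TF32 is attractive for FP32 workloads: it keeps the 8-bit exponent (and therefore the dynamic range) of FP32 while carrying the same 10 mantissa bits as FP16.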
In the second part of the talk, we will take a deep dive into the mixed-precision, Tensor Core-accelerated solvers and see how third-generation Tensor Cores can boost many HPC applications and workloads, delivering speedups of up to 4x on the A100 GPU.
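The abstract does not spell out the algorithm behind these solvers. A common approach to mixed-precision linear solves is iterative refinement: solve in low precision, compute the residual in FP64, and correct. The NumPy sketch below illustrates that idea on the CPU under two assumptions: FP32 stands in for the low Tensor Core precision, and the function name `refine_solve` and its parameters are hypothetical, not a CUDA library API.

```python
import numpy as np


def refine_solve(A, b, tol=1e-12, max_iter=20):
    """Solve A x = b with FP32 solves and FP64 residual corrections.

    Returns the FP64 solution and the number of refinement steps taken.
    """
    A64 = np.asarray(A, dtype=np.float64)
    b64 = np.asarray(b, dtype=np.float64)
    A32 = A64.astype(np.float32)  # low-precision copy of the matrix

    # Initial solve carried out entirely in FP32.
    x = np.linalg.solve(A32, b64.astype(np.float32)).astype(np.float64)

    steps = 0
    while steps < max_iter:
        r = b64 - A64 @ x  # residual computed in FP64
        if np.linalg.norm(r) <= tol * np.linalg.norm(b64):
            break
        # Correction solve in FP32. A production solver would reuse the
        # low-precision LU factors instead of factorizing again here.
        d = np.linalg.solve(A32, r.astype(np.float32))
        x = x + d.astype(np.float64)  # update accumulated in FP64
        steps += 1
    return x, steps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50
    A = rng.standard_normal((n, n)) + n * np.eye(n)  # well conditioned
    x_true = rng.standard_normal(n)
    x, steps = refine_solve(A, A @ x_true)
    print("refinement steps:", steps)
    print("max error:", np.max(np.abs(x - x_true)))
```

The expensive part, the factorization, runs in the fast low precision, while only the cheap residual and update steps need FP64. This works well when the matrix is reasonably well conditioned; for ill-conditioned systems the refinement may converge slowly or not at all.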