Posts by Paulius Micikevicius
Data Center / Cloud
Jan 27, 2021
Accelerating AI Training with NVIDIA TF32 Tensor Cores
The NVIDIA Ampere GPU architecture introduced the third generation of Tensor Cores, with the new TensorFloat32 (TF32) mode for accelerating FP32 convolutions and...
10 MIN READ
Data Science
Jun 10, 2019
Tips for Optimizing GPU Performance Using Tensor Cores
Our most popular question is "What can I do to get great GPU performance for deep learning?" We’ve recently published a detailed Deep Learning Performance...
13 MIN READ
Data Science
Oct 11, 2017
Mixed-Precision Training of Deep Neural Networks
Deep Neural Networks (DNNs) have led to breakthroughs in a number of areas, including image processing and understanding, language modeling, language...
9 MIN READ