GTC Silicon Valley-2019, ID S9143: Mixed Precision Training of Deep Neural Networks
Mixed precision training of deep neural networks provides tremendous benefits. It requires half the storage and data movement of single-precision values, and, starting with the Volta GPU's Tensor Cores, provides up to 120 TFLOPS of math throughput, an 8X speedup over FP32. In this tutorial we'll first present the considerations and techniques for training with reduced precision, including master weights and automatic loss scaling. Afterward, we'll discuss real-world training in mixed precision, with a particular focus on the PyTorch and TensorFlow frameworks.
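To illustrate the two techniques the abstract mentions, here is a minimal, framework-free sketch of why loss scaling and FP32 master weights matter. The specific numbers (a gradient of 1e-8, a loss scale of 1024) are illustrative assumptions, not values from the talk; production code would use a framework's automatic loss scaling rather than hand-picked constants.

```python
import numpy as np

# A tiny FP32 gradient that underflows to zero when cast to FP16:
# FP16 flushes values below roughly 3e-8 to zero on rounding.
tiny_grad = np.float32(1e-8)
assert np.float16(tiny_grad) == 0.0  # gradient is lost without scaling

# Loss scaling: multiply the loss (and hence all gradients) by a
# constant before backprop, so small gradients stay representable.
loss_scale = np.float32(1024.0)  # illustrative; frameworks tune this
scaled = np.float16(tiny_grad * loss_scale)
assert scaled != 0.0             # the scaled gradient survives in FP16

# Master weights: keep an FP32 copy of the parameters and apply the
# update there, after unscaling the gradient back to its true value.
master_weight = np.float32(0.5)
lr = np.float32(0.1)
unscaled_grad = np.float32(scaled) / loss_scale
master_weight -= lr * unscaled_grad
```

In a real training loop, frameworks such as PyTorch and TensorFlow automate both steps: they raise or lower the loss scale dynamically based on gradient overflow, and maintain the FP32 master copy inside the optimizer.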