Get greater GPU acceleration for deep learning models with Tensor Cores

Automatic Mixed Precision for Deep Learning

Deep neural network training has traditionally relied on the IEEE single-precision format. With mixed precision, you can train in half precision while maintaining the network accuracy achieved with single precision. This technique of using both single- and half-precision representations is referred to as the mixed precision technique.

Benefits of Mixed Precision Training

  • Speeds up math-intensive operations, such as linear and convolution layers, by using Tensor Cores.
  • Speeds up memory-limited operations by accessing half the bytes compared to single-precision.
  • Reduces memory requirements for training models, enabling larger models or larger minibatches.
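The memory claim in the list above can be checked with simple arithmetic. The following is a minimal sketch using only the Python standard library; the tensor shape is a hypothetical example chosen for illustration and is not taken from any particular model.

```python
import struct

# Bytes per value in the IEEE 754 binary formats, as reported by struct.
BYTES_FP32 = struct.calcsize("<f")  # single precision
BYTES_FP16 = struct.calcsize("<e")  # half precision


def tensor_bytes(num_values: int, bytes_per_value: int) -> int:
    """Storage needed for a dense tensor with num_values elements."""
    return num_values * bytes_per_value


# Hypothetical activation tensor: batch 64, 256 channels, 56 x 56 feature map.
num_values = 64 * 256 * 56 * 56
fp32_bytes = tensor_bytes(num_values, BYTES_FP32)
fp16_bytes = tensor_bytes(num_values, BYTES_FP16)

# Prints: fp32: 196 MiB, fp16: 98 MiB
print(f"fp32: {fp32_bytes / 2**20:.0f} MiB, fp16: {fp16_bytes / 2**20:.0f} MiB")
```

Halving the bytes per value halves both the storage and the memory traffic for that tensor, which is where the gains for memory-limited operations come from.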

    “Nuance Research advances and applies conversational AI technologies to power solutions that redefine how humans and computers interact. The rate of our advances reflects the speed at which we train and assess deep learning models. With Automatic Mixed Precision, we’ve realized a 50% speedup in TensorFlow-based ASR model training without loss of accuracy via a minimal code change. We’re eager to achieve a similar impact in our other deep learning language processing applications.”

    Wenxuan Teng, Senior Research Manager, Nuance Communications

    Enabling mixed precision involves two steps: porting the model to use the half-precision data type where appropriate, and using loss scaling to preserve small gradient values.
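To see why loss scaling matters, consider what happens to a very small gradient when it is stored in half precision. The sketch below uses Python's `struct` module to round values to IEEE half precision; the gradient value and the scale factor are hypothetical numbers picked for illustration, not values any framework is known to use.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


grad = 1e-8          # hypothetical small gradient value
loss_scale = 1024.0  # hypothetical scale factor (a power of two)

# Stored directly in half precision, the gradient underflows to zero.
unscaled = to_fp16(grad)

# Multiplying the loss by loss_scale multiplies every gradient by the same
# factor (chain rule), moving the value into the representable range.
scaled = to_fp16(grad * loss_scale)

# Divide the scale back out in higher precision before the weight update.
recovered = scaled / loss_scale

print(unscaled)   # 0.0
print(recovered)  # close to 1e-8, within half-precision rounding error
```

Without scaling, the update carried by this gradient would be lost entirely; with scaling, it survives with a small rounding error.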

    The automatic mixed precision feature in TensorFlow, PyTorch, and MXNet provides deep learning researchers and engineers with AI training speedups of up to 3X on NVIDIA Volta and Turing GPUs by adding just a few lines of code.


    Using Automatic Mixed Precision for Major Deep Learning Frameworks


    The Automatic Mixed Precision feature is available both in native TensorFlow and inside the TensorFlow container on the NVIDIA NGC container registry. It is enabled by setting the TF_ENABLE_AUTO_MIXED_PRECISION environment variable to 1.


    As an alternative, the environment variable can be set inside the TensorFlow Python script:

    os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '1'

    Automatic mixed precision applies both of these steps, automatic casting and automatic loss scaling, internally in TensorFlow with a single environment variable, along with more fine-grained control when necessary.
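Automatic loss scaling is commonly implemented as a dynamic policy: lower the scale when gradients overflow, and raise it again after a run of successful steps. The sketch below illustrates that general idea in plain Python. The class name, its parameters, and the halve/double policy are assumptions made for illustration, not a description of what TensorFlow does internally.

```python
import math


class DynamicLossScaler:
    """Hypothetical helper illustrating a common dynamic loss-scaling policy."""

    def __init__(self, init_scale: float = 2.0**15, growth_interval: int = 2000):
        self.scale = init_scale
        self.growth_interval = growth_interval
        self._good_steps = 0

    def update(self, grads) -> bool:
        """Adjust the scale; return True if the optimizer step should be applied."""
        if any(not math.isfinite(g) for g in grads):
            # Overflow (inf or NaN): skip this step and halve the scale.
            self.scale /= 2.0
            self._good_steps = 0
            return False
        self._good_steps += 1
        if self._good_steps == self.growth_interval:
            # A full interval without overflow: try a larger scale.
            self.scale *= 2.0
            self._good_steps = 0
        return True


scaler = DynamicLossScaler(init_scale=1024.0, growth_interval=2)
print(scaler.update([0.5, float("inf")]), scaler.scale)  # False 512.0
print(scaler.update([0.5, 0.25]), scaler.scale)          # True 512.0
print(scaler.update([0.5, 0.25]), scaler.scale)          # True 1024.0
```

A policy of this kind keeps the scale as large as the half-precision range allows, without requiring the user to tune a fixed value by hand.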

    Additionally, for NGC TensorFlow 19.07 or later, and native TensorFlow 1.14 or later, an explicit optimizer wrapper is available:

    opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)

    “TensorFlow developers will greatly benefit from NVIDIA automatic mixed precision feature. This easy integration enables them to get up to 3X higher performance with mixed precision training on NVIDIA Tensor Core GPUs while maintaining model accuracy.”

    — Rajat Monga, Engineering Director, TensorFlow, Google

    “Automated mixed precision powered by NVIDIA Tensor Core GPUs on Alibaba allows us to instantly speedup AI models nearly 3X. Our researchers appreciated the ease of turning on this feature to instantly accelerate our AI.”

    — Wei Lin, Senior Director at Alibaba Computing Platform, Alibaba


    For PyTorch, the Automatic Mixed Precision feature is available in the Apex repository on GitHub. To enable it, add these lines of code to your existing training script:

    from apex import amp

    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()


    The Automatic Mixed Precision feature is available both in native MXNet (1.5 or later) and inside the MXNet container (19.04 or later) on the NVIDIA NGC container registry. To enable the feature, initialize AMP with amp.init() and amp.init_trainer(trainer) from mxnet.contrib.amp, then wrap the backward pass in your existing training script as follows:

    with amp.scale_loss(loss, trainer) as scaled_loss:
        autograd.backward(scaled_loss)

    Additional Resources