Get greater GPU acceleration for deep learning models with Tensor Cores

Automatic Mixed Precision for Deep Learning

Deep neural network training has traditionally relied on the IEEE single-precision format; with mixed precision, however, you can train with half precision while maintaining the accuracy achieved with single precision. This approach of using both single- and half-precision representations is referred to as the mixed precision technique.

Benefits of Mixed Precision Training

  • Speeds up math-intensive operations, such as linear and convolution layers, by using Tensor Cores.
  • Speeds up memory-limited operations by accessing half the bytes compared to single-precision.
  • Reduces memory requirements for training models, enabling larger models or larger minibatches.

Enabling mixed precision involves two steps: porting the model to use the half-precision data type where appropriate, and using loss scaling to preserve small gradient values.
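Why the second step matters can be seen with a small, self-contained sketch. It uses only the standard library's `struct` module to round-trip values through IEEE binary16; the gradient value and scale factor are illustrative, not framework defaults:

```python
import struct

def to_fp16(x):
    # Round-trip a Python float through IEEE binary16 ('e' format),
    # reproducing half-precision rounding and underflow.
    return struct.unpack('e', struct.pack('e', x))[0]

grad = 1e-8                     # a small gradient value (illustrative)
print(to_fp16(grad))            # underflows to 0.0 in half precision

scale = 2.0 ** 15               # loss scale factor (illustrative)
scaled = to_fp16(grad * scale)  # now representable in half precision
recovered = scaled / scale      # unscale in full precision
print(abs(recovered - grad) / grad < 1e-2)  # True: value preserved
```

Scaling the loss before the backward pass multiplies every gradient by the same factor, lifting small values into half precision's representable range; dividing by the factor afterwards restores their magnitude.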

    The automatic mixed precision feature in TensorFlow, PyTorch, and MXNet provides deep learning researchers and engineers with AI training speedups of up to 3x on NVIDIA Volta and Turing GPUs by adding just a few lines of code.


    Using Automatic Mixed Precision for Major Deep Learning Frameworks


    The Automatic Mixed Precision feature is available both in native TensorFlow and inside the TensorFlow container on the NVIDIA NGC container registry. To enable it, set a single environment variable before launching training:

    export TF_ENABLE_AUTO_MIXED_PRECISION=1


    As an alternative, the environment variable can be set inside the TensorFlow Python script:

    import os
    os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '1'

    Automatic mixed precision applies both of these steps, automatic casting and automatic loss scaling, internally in TensorFlow with a single environment variable, and offers more fine-grained control when necessary.
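The loss scaling a framework applies automatically is typically dynamic. The toy scaler below illustrates the usual scheme; the constants and the exact backoff/growth policy here are assumptions for illustration, not TensorFlow's internals:

```python
import math

class DynamicLossScaler:
    # Toy dynamic loss scaler (assumed policy, not TensorFlow's exact
    # internals): halve the scale when gradients overflow, and try a
    # larger scale again after a run of stable steps.
    def __init__(self, scale=2.0 ** 15, growth_interval=2000):
        self.scale = scale
        self.growth_interval = growth_interval
        self._stable_steps = 0

    def unscale(self, grads):
        # Overflow in the half-precision gradients: halve the scale
        # and signal the caller to skip this optimizer step.
        if not all(math.isfinite(g) for g in grads):
            self.scale /= 2.0
            self._stable_steps = 0
            return None
        result = [g / self.scale for g in grads]
        self._stable_steps += 1
        if self._stable_steps >= self.growth_interval:
            self.scale *= 2.0  # gradients stable: try a larger scale
            self._stable_steps = 0
        return result
```

With automatic mixed precision, this per-step bookkeeping is done for you; the environment variable or optimizer wrapper is all a training script needs to touch.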

    Additionally, for NGC TensorFlow 19.07 or later, and native TensorFlow 1.14 or later, an explicit optimizer wrapper is available:

    opt = tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)

    “TensorFlow developers will greatly benefit from NVIDIA automatic mixed precision feature. This easy integration enables them to get up to 3X higher performance with mixed precision training on NVIDIA Tensor Core GPUs while maintaining model accuracy.”

    — Rajat Monga, Engineering Director, TensorFlow, Google

    “Automated mixed precision powered by NVIDIA Tensor Core GPUs on Alibaba allows us to instantly speedup AI models nearly 3X. Our researchers appreciated the ease of turning on this feature to instantly accelerate our AI.”

    — Wei Lin, Senior Director at Alibaba Computing Platform, Alibaba


    For PyTorch, the Automatic Mixed Precision feature is available in NVIDIA's Apex repository on GitHub. To enable it, add the following lines of code to your existing training script:

    from apex import amp

    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()

    The Automatic Mixed Precision feature is available both in native MXNet (1.5 or later) and inside the MXNet container (19.04 or later) on the NVIDIA NGC container registry. To enable the feature, add the following lines of code to your existing training script:

    from mxnet.contrib import amp
    from mxnet import autograd

    amp.init()
    amp.init_trainer(trainer)

    with amp.scale_loss(loss, trainer) as scaled_loss:
        autograd.backward(scaled_loss)

    Additional Resources