GTC 2020: GPipe: Efficient Training of Giant Neural Networks Using Pipeline Parallelism
Yanping Huang, Google
Scaling up deep neural network capacity is an effective way to improve model quality for several different machine learning tasks. In many cases, increasing model capacity beyond the memory limit of a single accelerator has required developing special algorithms or infrastructure. These solutions are often architecture-specific and do not transfer to other tasks. To address the need for efficient and task-independent model parallelism, we'll introduce GPipe, a pipeline parallelism library that can scale any network that can be expressed as a sequence of layers. By pipelining different sub-sequences of layers on separate accelerators, GPipe makes it possible to scale a variety of networks efficiently beyond the memory limit of a single accelerator. Moreover, GPipe uses a novel batch-splitting pipelining algorithm, which yields almost linear speedup when a model is partitioned across multiple accelerators.
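The batch-splitting idea in the abstract can be illustrated with a small scheduling sketch. This is not GPipe's API: the function names below are hypothetical, the model is forward-pass only, and it assumes every stage takes the same time per micro-batch with no communication cost. Under those assumptions, a mini-batch split into M micro-batches and pushed through K stages finishes in M + K - 1 clock ticks instead of M * K, which is why the speedup approaches K as M grows.

```python
def split_into_stages(layers, num_stages):
    """Partition a layer sequence into contiguous, near-equal stages."""
    base, extra = divmod(len(layers), num_stages)
    stages, start = [], 0
    for k in range(num_stages):
        # The first `extra` stages take one additional layer each.
        size = base + (1 if k < extra else 0)
        stages.append(layers[start:start + size])
        start += size
    return stages


def forward_schedule(num_stages, num_micro_batches):
    """List, per clock tick, the (stage, micro_batch) pairs running concurrently."""
    ticks = []
    for t in range(num_stages + num_micro_batches - 1):
        # Stage k works on micro-batch t - k once that micro-batch has reached it.
        active = [(k, t - k) for k in range(num_stages)
                  if 0 <= t - k < num_micro_batches]
        ticks.append(active)
    return ticks


def ideal_speedup(num_stages, num_micro_batches):
    """Speedup over running the stages one at a time (uniform stage time assumed)."""
    sequential_ticks = num_stages * num_micro_batches
    pipelined_ticks = num_stages + num_micro_batches - 1
    return sequential_ticks / pipelined_ticks


if __name__ == "__main__":
    stages = split_into_stages([f"layer{i}" for i in range(10)], num_stages=4)
    print([len(s) for s in stages])
    for t, active in enumerate(forward_schedule(num_stages=3, num_micro_batches=4)):
        print(t, active)
    print(ideal_speedup(num_stages=4, num_micro_batches=32))
```

With a single micro-batch the sketch reports no speedup at all, since only one stage is ever busy; the gain comes entirely from splitting the batch finely enough to keep all stages occupied.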