GTC Silicon Valley-2019: OpenSeq2Seq: A Deep Learning Toolkit for Speech Recognition, Speech Synthesis, and NLP
Session ID: S9187
Boris Ginsburg (NVIDIA), Oleksii Kuchaiev (NVIDIA)
We'll discuss OpenSeq2Seq, a TensorFlow-based toolkit for training deep learning models optimized for NVIDIA GPUs. The main features of our toolkit are ease of use, modularity, and support for fast distributed and mixed-precision training. OpenSeq2Seq provides a large set of state-of-the-art models and building blocks for neural machine translation (GNMT, Transformer, ConvS2S, etc.), automatic speech recognition (DeepSpeech2, Wave2Letter, etc.), speech synthesis (Tacotron2, etc.), and language modeling. All models have been optimized for mixed-precision training with GPU Tensor Cores, and they achieve a 1.5-3x training speed-up compared to float32.
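To give a sense of the "ease of use" claim, the sketch below outlines what an OpenSeq2Seq-style Python configuration file might look like for a speech recognition model trained with mixed precision on multiple GPUs. The parameter names shown here ("dtype", "loss_scaling", "num_gpus", etc.) are illustrative assumptions based on the toolkit's config-file approach, not an excerpt from the talk or the repository.

# Illustrative sketch of an OpenSeq2Seq-style config file (assumed key names,
# not the toolkit's verbatim API): a DeepSpeech2-like ASR model trained with
# mixed precision on several GPUs.
base_params = {
    # Data-parallel training across GPUs; the toolkit also supports
    # Horovod-based multi-node runs.
    "num_gpus": 8,
    "batch_size_per_gpu": 32,

    # Mixed precision: float16 compute on Tensor Cores with a float32
    # master copy of the weights; loss scaling guards against float16
    # gradient underflow.
    "dtype": "mixed",
    "loss_scaling": "Backoff",

    "num_epochs": 50,
    "optimizer": "Adam",
    "lr_policy_params": {"learning_rate": 1e-3},
    "logdir": "experiments/ds2_mixed_precision",
}

Training would then typically be launched through the toolkit's run script, e.g. python run.py --config_file=<config> --mode=train_eval (command shown as an assumption about the usual workflow, with the config path elided).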