At GTC 2020, NVIDIA announced and shipped a range of new AI SDKs, enabling developers to support the new Ampere architecture. For the first time, developers have the tools to build end-to-end deep learning-based pipelines for conversational AI and recommendation systems.
Announcing NVIDIA Riva, an accelerated SDK for conversational AI services
Today NVIDIA announced Riva, a fully accelerated application framework for building multimodal conversational AI services. It includes state-of-the-art DL models, tools for transfer learning and deployment, and optimized services that run in under 300 ms, the threshold for real-time applications, versus 25 seconds on CPU-only systems. Riva integrates several components:
- NeMo: Open-source toolkit to build and fine-tune state-of-the-art conversational AI models. It includes Python module collections for building models easily, supports mixed-precision compute to speed up training and fine-tuning, and can deploy to Riva services for use in production.
- Megatron-BERT: World’s largest BERT model, with 3.9 billion parameters and state-of-the-art accuracy for reading comprehension. With innovations to the model architecture, the model trains efficiently and scales linearly to hundreds of GPUs, and accuracy increases as the model size grows.
- TensorRT 7.1: High-performance SDK optimized for NVIDIA A100 GPUs. It includes new optimizations that accelerate BERT inference using INT8 precision, delivering 6x higher performance than on V100 GPUs. Version 7.1 builds on existing conversational AI capabilities to achieve real-time performance across speech recognition, natural language understanding, and text-to-speech models.
- Flowtron: State-of-the-art speech synthesis model that generates more realistic and controllable voice expression by sampling from an invertible flow-based model. The model debuted publicly as the narrator of the newly released I AM AI opening keynote video at GTC Digital 2020. It is described in a preprint paper available on arXiv.
Join us for our upcoming webinar: Training and Deploying Conversational AI Applications with NeMo and Riva
- Introducing Riva: Framework for GPU-Accelerated Conversational AI Applications
- NVIDIA NeMo: Fast development of speech and language models
- Jumpstart Training for Speech Recognition Models in Different Languages with NeMo
Announcing NVIDIA Merlin – Application Framework for Deep Learning Recommender Systems
Today, NVIDIA announced NVIDIA Merlin, an application framework for building deep learning-based recommendation systems. It includes tools for each stage of the pipeline: NVTabular for feature engineering and preprocessing, and HugeCTR for distributed training of popular deep recommender models. Each tool is optimized to support hundreds of terabytes of data and is accessible through easy-to-use APIs.
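To make the feature engineering stage more concrete, here is a minimal sketch in plain Python of two preprocessing steps that recommender pipelines commonly apply: mapping categorical values to integer ids and standardizing continuous values. It does not use the NVTabular API. The function names `categorify` and `normalize` and the toy data are assumptions made up for illustration, and a real pipeline would run these steps on the GPU over far larger datasets.

```python
from statistics import mean, pstdev


def categorify(values):
    """Map each distinct category to a contiguous integer id, in first-seen order."""
    vocab = {}
    ids = []
    for v in values:
        if v not in vocab:
            vocab[v] = len(vocab)
        ids.append(vocab[v])
    return ids, vocab


def normalize(values):
    """Standardize a continuous column to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for constant columns
    return [(v - mu) / sigma for v in values]


# Hypothetical toy interaction log (illustrative only, not from NVIDIA).
item_ids = ["shoe", "hat", "shoe", "scarf"]
prices = [60.0, 20.0, 60.0, 20.0]

encoded_items, item_vocab = categorify(item_ids)
scaled_prices = normalize(prices)
```

Integer ids like these are what an embedding layer in a deep recommender model would typically consume, while the standardized values would feed the model's dense inputs.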
Join us for our upcoming webinar: HugeCTR: High-Performance Click-Through Rate Estimation Training
Introductory blog: Announcing NVIDIA Merlin: Application Framework for Deep Recommender Systems