Explore NVIDIA Riva benefits.
Built on State-of-the-Art NVIDIA AI
Riva is part of the NVIDIA AI platform, which has been built on a decade of AI innovations by NVIDIA across model architectures, training techniques, inference optimizations, and deployment solutions.
Riva offers flexibility at every step, from modifying model architectures to fine-tuning models on your data and customizing pipelines, and it can be deployed on any platform.
Continued optimizations across the entire stack, from models to software to hardware, have delivered a 12X gain over the previous generation.
World-Class Speech AI
As speech-based applications are adopted globally, solutions need to interact with humans across many languages. Speech AI apps need to understand industry-specific jargon and respond naturally in real time. Riva includes world-class automatic speech recognition (ASR) and text-to-speech (TTS) that run in real time.
Try NVIDIA Riva automatic speech recognition.
In this demo, Riva ASR delivers highly accurate transcription in real time.
You can provide an input through your microphone or upload a .wav file from your device.
The duration of each sample is limited to 30 seconds.
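To stay within the 30-second limit, you can check a recording's length before uploading it. The following is a minimal sketch using only Python's standard `wave` module. The helper names `wav_duration_seconds` and `fits_demo_limit` are hypothetical and not part of Riva, and the check assumes an uncompressed PCM .wav file.

```python
import wave

MAX_DEMO_SECONDS = 30.0  # sample limit stated for the demo


def wav_duration_seconds(path):
    """Return the duration of a PCM .wav file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration = number of audio frames / frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def fits_demo_limit(path, limit=MAX_DEMO_SECONDS):
    """Return True if the recording is no longer than the limit (hypothetical helper)."""
    return wav_duration_seconds(path) <= limit
```

Calling `fits_demo_limit("sample.wav")` on a local file (the file name is a placeholder) tells you whether the sample can be used as is or needs trimming first.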
Try NVIDIA Riva Text-to-Speech.
If you’re looking to add voice to your interactive virtual assistant, modern home device, or reading assistant for people with a reading disability or visual impairment, try Riva’s out-of-the-box (OOTB) English female or male voice.
Hear the human-like expressive voices created using Riva’s state-of-the-art (SOTA) neural speech synthesis models.
What Is NVIDIA Riva?
Simple end-to-end workflow for speech AI applications.
- Pretrained SOTA speech AI models: ASR and TTS models are fully customizable with your own datasets, accelerating the development of domain-specific models by 10X.
- High-performance inference: Inference is powered by NVIDIA TensorRT™ optimizations and served using the NVIDIA Triton™ Inference Server, both components of the NVIDIA AI platform.
- Riva services: These are available as gRPC-based microservices for low-latency streaming and high-throughput offline use cases.
- High scalability: Fully containerized, Riva can easily scale to hundreds and thousands of parallel streams.
Learn more about Riva ASR.
Speech recognition technology enables voice search on the internet, hands-free computing, voice commands to smart home devices and in-car assistants, medical note taking, 24/7 contact center virtual assistants, and transcription of phone calls and video conferences for pattern and trend analysis. NVIDIA Riva automatic speech recognition (ASR) delivers world-class, accurate transcripts based on GPU-optimized models, fully customizable for any domain or deployment platform.
Key features of Riva ASR include:
- Support for English, Spanish, Mandarin, Hindi, Russian, German, and French
- Out-of-the-box models trained on a variety of domain-specific data for hundreds of thousands of hours on NVIDIA GPUs
- Best-possible accuracy for different languages, accents, domains, vocabulary, and context by fine-tuning vocabulary, lexicon, acoustic, language, punctuation, and inverse text normalization models
- The ability to return streaming transcripts with automatic punctuation and word-level timestamps for hundreds of thousands of input audio streams
- Word and profanity filtering with customizable, effective removal of offensive spoken words
Learn more about Riva TTS.
Text-to-speech produces voices that narrate e-books and documents, converse with humans as smart assistants or digital avatars, and are part of nearly all digital devices, including smartphones, tablets, and laptops. NVIDIA Riva text-to-speech (TTS) provides human-like synthetic voices based on state-of-the-art spectrogram generation and vocoder models. TTS pipelines are customizable and GPU-optimized to run efficiently in real time.
Key features of Riva TTS include:
- SOTA models for generating expressive, human-like voices
- Two out-of-the-box professional female and male voices for US English
- Easy voice and accent fine-tuning with pitch, volume, and duration control for expressivity
- 12X higher inference performance versus existing technologies
Fast-Track Your Riva Journey with NVIDIA LaunchPad
Get immediate access to NVIDIA Riva with free curated labs. Access step-by-step guided labs for speech AI with ready-to-use software, sample data, and applications.
Read customer stories.
With NVIDIA Riva, RingCentral achieved unparalleled real-time transcription accuracy for video meetings, serving millions of users with diverse accents and domain-specific jargon globally.
T-Mobile uses NVIDIA Riva ASR in its call centers to accurately transcribe customer conversations and provide real-time recommendations to agents for quickly resolving customer queries.
Tarteel uses NVIDIA Riva and NVIDIA NeMo to provide real-time feedback on Quran recitation at scale, enabling Muslims, instructors, content creators, and researchers to engage with the Quran.
Data Monsters used NVIDIA Riva to add a speech pipeline to the Plabook app that helps students read, assesses their accuracy at the phoneme level, and provides individualized feedback.
Explore more resources.
Get an introduction.
Understand the key features in Riva that help you build speech AI services.
Explore the starter kit.
Get everything you need to start developing your speech AI with NVIDIA Riva, including tutorials, Jupyter Notebooks, and documentation.
NVIDIA Riva is available from the NVIDIA GPU Cloud for members of the NVIDIA Developer Program.