Technical Walkthrough

Autoscaling NVIDIA Riva Deployment with Kubernetes for Speech AI in Production

Speech AI applications, from call centers to virtual assistants, rely heavily on automatic speech recognition (ASR) and text-to-speech (TTS). ASR can process...
13 MIN READ
Technical Walkthrough

Deploying Diverse AI Model Categories from Public Model Zoo Using NVIDIA Triton Inference Server

A huge number of state-of-the-art (SOTA) model implementations and modeling solutions are now available across frameworks such as TensorFlow, ONNX,...
12 MIN READ
Technical Walkthrough

Introducing NVIDIA Riva: A GPU-Accelerated SDK for Developing Speech AI Applications

This post was updated from November 2021. Sign up for the latest Speech AI news from NVIDIA. Speech AI is used in a variety of applications, including contact...
8 MIN READ
Technical Walkthrough

Designing an Optimal AI Inference Pipeline for Autonomous Driving

Self-driving cars must be able to detect objects quickly and accurately to ensure the safety of their drivers and other drivers on the road. Due to this need...
8 MIN READ
Technical Walkthrough

Deploying a 1.3B GPT-3 Model with NVIDIA NeMo Megatron

Large language models (LLMs) are among the most advanced deep learning models capable of understanding written language. Many modern LLMs are...
11 MIN READ
News

Run Multiple AI Models on the Same GPU with Amazon SageMaker Multi-Model Endpoints Powered by NVIDIA Triton Inference Server

Last November, AWS integrated NVIDIA Triton Inference Server, open-source inference serving software, into Amazon SageMaker. Machine learning (ML) teams can use...
2 MIN READ