Technical Blog
Tag: Triton
Technical Walkthrough
Jan 12, 2023
Autoscaling NVIDIA Riva Deployment with Kubernetes for Speech AI in Production
Speech AI applications, from call centers to virtual assistants, rely heavily on automatic speech recognition (ASR) and text-to-speech (TTS). ASR can process...
13 MIN READ
Technical Walkthrough
Dec 19, 2022
Deploying Diverse AI Model Categories from Public Model Zoo Using NVIDIA Triton Inference Server
Today, a huge number of state-of-the-art (SOTA) model implementations and modeling solutions are available for different frameworks like TensorFlow, ONNX,...
12 MIN READ
Technical Walkthrough
Dec 08, 2022
Introducing NVIDIA Riva: A GPU-Accelerated SDK for Developing Speech AI Applications
This post was updated from November 2021. Sign up for the latest Speech AI news from NVIDIA. Speech AI is used in a variety of applications, including contact...
8 MIN READ
Technical Walkthrough
Nov 30, 2022
Designing an Optimal AI Inference Pipeline for Autonomous Driving
Self-driving cars must be able to detect objects quickly and accurately to ensure the safety of their occupants and of other drivers on the road. Due to this need...
8 MIN READ
Technical Walkthrough
Nov 04, 2022
Deploying a 1.3B GPT-3 Model with NVIDIA NeMo Megatron
Large language models (LLMs) are among the most advanced deep learning models capable of understanding written language. Many modern LLMs are...
11 MIN READ
News
Oct 25, 2022
Run Multiple AI Models on the Same GPU with Amazon SageMaker Multi-Model Endpoints Powered by NVIDIA Triton Inference Server
Last November, AWS integrated the open-source inference serving software NVIDIA Triton Inference Server into Amazon SageMaker. Machine learning (ML) teams can use...
2 MIN READ