DEVELOPER BLOG
Posts by Davide Onofrio
AI / Deep Learning
Dec 18, 2020
Minimizing Deep Learning Inference Latency with NVIDIA Multi-Instance GPU
Recently, NVIDIA unveiled the A100 GPU model, based on the NVIDIA Ampere architecture. Ampere introduced many features, including Multi-Instance GPU (MIG)…
20 MIN READ
AI / Deep Learning
Oct 05, 2020
Simplifying and Scaling Inference Serving with NVIDIA Triton 2.3
AI, machine learning (ML), and deep learning (DL) are effective tools for solving diverse computing problems such as product recommendations…
12 MIN READ
AI / Deep Learning
Aug 14, 2020
Integrating NVIDIA Triton Inference Server with Kaldi ASR
Speech processing is compute-intensive and requires a powerful and flexible platform to power modern conversational AI applications. It seemed natural to…
13 MIN READ
AI / Deep Learning
May 14, 2020
Introducing NVIDIA Jarvis: A Framework for GPU-Accelerated Conversational AI Applications
This post was updated to include information on the NVIDIA Jarvis open beta. Real-time conversational AI is a complex and challenging task. To allow real-time…
9 MIN READ
AI / Deep Learning
Aug 13, 2019
Real-Time Natural Language Understanding with BERT Using TensorRT
Large-scale language models (LSLMs) such as BERT, GPT-2, and XLNet have brought about exciting leaps in state-of-the-art accuracy for many natural language…
21 MIN READ