Posts by Disha Mehra
Generative AI
Dec 18, 2024
NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference
Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...
6 MIN READ
Simulation / Modeling / Design
Nov 09, 2021
Building and Deploying Conversational AI Models Using NVIDIA TAO Toolkit
Conversational AI is a set of technologies enabling human-like interactions between humans and devices based...
25 MIN READ