Rakesh Madugundu

Rakesh is an ML performance engineer at Sarvam AI. He focuses on accelerating model inference by optimizing at both the system and kernel levels to reduce production latency. He is passionate about low-level engineering, with a particular interest in writing custom kernels and building foundational architectures from scratch to maximize hardware efficiency.

Posts by Rakesh Madugundu

Agentic AI / Generative AI

How NVIDIA Extreme Hardware-Software Co-Design Delivered a Large Inference Boost for Sarvam AI’s Sovereign Models

As global AI adoption accelerates, developers face a growing challenge: delivering large language model (LLM) performance that meets real-world latency and cost...

15 min read