Sriharsha Niverty

Sriharsha Niverty focuses on AI infrastructure at NVIDIA, optimizing systems-level performance for large-scale LLM inference and training workloads. Previously, he worked on graphics application performance and architecture exploration, with an emphasis on efficient work scheduling inside the GPU.

Posts by Sriharsha Niverty

Agentic AI / Generative AI

How NVIDIA Extreme Hardware-Software Co-Design Delivered a Large Inference Boost for Sarvam AI’s Sovereign Models

As global AI adoption accelerates, developers face a growing challenge: delivering large language model (LLM) performance that meets real-world latency and cost... 15 MIN READ
Simulation / Modeling / Design

Advanced API Performance: Async Compute and Overlap

This post covers best practices for async compute and overlap on NVIDIA GPUs. To get a high and consistent frame rate in your applications, see all Advanced API... 8 MIN READ