Vikram Sharma Mailthody

Dr. Vikram Sharma Mailthody is part of NVIDIA Research and a co-architect of NVIDIA Dynamo. His work focuses on solving foundational systems-level challenges in emerging data center workloads, with an emphasis on scalable GPU memory and storage system architectures.

Posts by Vikram Sharma Mailthody

Agentic AI / Generative AI

DynoSim: Simulating the Pareto Frontier

Modern LLM serving is hard to tune because each deployment is a stack of interacting choices: model backend, tensor-parallel shape, prefill/decode split, worker... 12 MIN READ
Top Stories

NVIDIA Dynamo Snapshot: Fast Startup for Inference Workloads on Kubernetes

The cold-start problem: In production inference deployments, demand fluctuates over time, requiring inference replicas to scale elastically. However,... 15 MIN READ
Data Center / Cloud

NVIDIA Dynamo Adds GPU Autoscaling, Kubernetes Automation, and Networking Optimizations

At NVIDIA GTC 2025, we announced NVIDIA Dynamo, a high-throughput, low-latency open-source inference serving framework for deploying generative AI and reasoning... 7 MIN READ