Yongming Ding

Yongming Ding is a senior software engineer at NVIDIA. His work focuses on building LLM inference systems and data platforms for datacenter-scale AI workloads.

Posts by Yongming Ding

Agentic AI / Generative AI

DynoSim: Simulating the Pareto Frontier

Modern LLM serving is hard to tune because each deployment is a stack of interacting choices: model backend, tensor-parallel shape, prefill/decode split, worker...

12 MIN READ