Hongkuan Zhou

Dr. Hongkuan Zhou is a senior Deep Learning Algorithm Engineer. His work focuses on developing efficient and scalable LLM inference systems. Previously, he worked on the acceleration and application of graph neural networks.

Posts by Hongkuan Zhou

Agentic AI / Generative AI

DynoSim: Simulating the Pareto Frontier

Modern LLM serving is hard to tune because each deployment is a stack of interacting choices: model backend, tensor-parallel shape, prefill/decode split, worker... 12 MIN READ
Data Center / Cloud

NVIDIA Dynamo Adds GPU Autoscaling, Kubernetes Automation, and Networking Optimizations

At NVIDIA GTC 2025, we announced NVIDIA Dynamo, a high-throughput, low-latency open-source inference serving framework for deploying generative AI and reasoning... 7 MIN READ