Max Xu

Max Xu is a senior technical lead at NVIDIA specializing in AI training and inference at scale, performance engineering, and end-to-end application deployment. He brings full-stack GPU expertise, from chip design through CUDA and kernel-level development to server and cloud systems for model training and inference, translating innovations into real-world impact. Before NVIDIA, Max held engineering roles at major cloud service provider (CSP) and semiconductor companies.

Posts by Max Xu

AI Platforms / Deployment

Pruning and Distilling LLMs Using NVIDIA TensorRT Model Optimizer

Large language models (LLMs) have set a high bar in natural language processing (NLP) tasks such as coding, reasoning, and math. However, their deployment... 11 MIN READ