Jun Yang

Jun Yang is a senior engineering director at NVIDIA, where he focuses on end-to-end AI workload optimization. He currently leads the overall engineering effort for NVIDIA TensorRT-LLM. He holds a master’s degree in Computer Architecture from the Institute of Computing Technology, Chinese Academy of Sciences.

Posts by Jun Yang

Data Center / Cloud

NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1

Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a... 13 MIN READ