Yilin Fan

Yilin Fan is a senior deep learning engineer at NVIDIA focusing on TensorRT and TensorRT-LLM performance. He has a general interest in deep learning inference acceleration. Prior to joining NVIDIA, he worked at Pony.ai, where he optimized and deployed deep learning models on autonomous vehicles. Yilin received his master's degree in software engineering from Carnegie Mellon University and his bachelor's degree from Beihang University in Beijing.

Posts by Yilin Fan

Data Center / Cloud

Blackwell Breaks the 1,000 TPS/User Barrier With Meta’s Llama 4 Maverick

NVIDIA has achieved a world-record large language model (LLM) inference speed. A single NVIDIA DGX B200 node with eight NVIDIA Blackwell GPUs can achieve over...

9 MIN READ