Yiheng Zhang

Yiheng Zhang is a Software Engineer on the TensorRT team at NVIDIA, where he focuses on MLPerf Inference. Yiheng has experience with autonomous driving software, Jetson platform software optimization, and general performance optimization in MLPerf Inference. Yiheng holds a master's degree in Computer Science from Stanford University.

Posts by Yiheng Zhang

Data Center / Cloud

NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1

Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a... (13 min read)
Generative AI / LLMs

NVIDIA H200 Tensor Core GPUs and NVIDIA TensorRT-LLM Set MLPerf LLM Inference Records

Generative AI is unlocking new computing applications that greatly augment human capability, enabled by continued model innovation. Generative AI... (11 min read)
Data Center / Cloud

Leading MLPerf Inference v3.1 Results with NVIDIA GH200 Grace Hopper Superchip Debut

AI is transforming computing, and inference is how the capabilities of AI are deployed in the world's applications. Intelligent chatbots, image and video... (13 min read)
Data Center / Cloud

Setting New Records in MLPerf Inference v3.0 with Full-Stack Optimizations for AI

The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment... (15 min read)