Zhiyu Cheng

Zhiyu Cheng is a tech lead manager at NVIDIA, where he leads efforts to optimize large language models (LLMs) and diffusion models for NVIDIA GPUs and cloud services (NeMo/Picasso). He has over 10 years of experience in efficient machine learning and deep learning, gained at NXP, Xilinx, Baidu, and OmniML (acquired by NVIDIA). Zhiyu has more than 30 published papers and patents. He holds a Ph.D. in electrical and computer engineering from the University of Illinois, with a thesis in information theory.

Posts by Zhiyu Cheng

Generative AI

NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support

NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art quantization toolkit of model optimization techniques... 5 MIN READ
Generative AI

NVIDIA TensorRT Accelerates Stable Diffusion Nearly 2x Faster with 8-bit Post-Training Quantization

In the dynamic realm of generative AI, diffusion models stand out as the most powerful architecture for generating high-quality images with text prompts. Models... 7 MIN READ