Zhiyu Cheng

Zhiyu Cheng is a tech lead manager at NVIDIA, where he focuses on optimizing large language models (LLMs) and diffusion models for NVIDIA GPUs and cloud services (NeMo/Picasso). He has more than 10 years of experience in efficient machine learning and deep learning, gained at NXP, Xilinx, Baidu, and OmniML (acquired by NVIDIA). Zhiyu has more than 30 published papers and patents. He holds a Ph.D. in electrical and computer engineering from the University of Illinois, with a thesis in information theory.

Posts by Zhiyu Cheng

Generative AI / LLMs

NVIDIA TensorRT Accelerates Stable Diffusion Nearly 2x Faster with 8-bit Post-Training Quantization

In the dynamic realm of generative AI, diffusion models stand out as the most powerful architecture for generating high-quality images with text prompts. Models... 7 MIN READ