Hao Wu

Hao Wu is a senior GPU compute architect at NVIDIA. He joined the NVIDIA Compute Architecture group in 2011 after finishing his Ph.D. at the Chinese Academy of Sciences. Recently, Hao’s technical focus has been applying low precision to deep neural network training and inference.

Posts by Hao Wu

Technical Walkthrough

Achieving FP32 Accuracy for INT8 Inference Using Quantization Aware Training with NVIDIA TensorRT

TensorRT is an SDK for high-performance deep learning inference, and with TensorRT 8.0, you can import models trained using Quantization Aware Training (QAT) to run inference in INT8 precision with... 17 MIN READ
Technical Walkthrough

Int4 Precision for AI Inference

INT4 Precision Can Bring an Additional 59% Speedup Compared to INT8

If there’s one constant in AI and deep learning, it’s never-ending optimization to wring… 5 MIN READ