Ruixiang Wang

Ruixiang Wang is a senior developer technology engineer for LLMs and generative AI at NVIDIA. His current focus is on optimizing AI workloads, including both training and inference, to achieve speed-of-light performance on NVIDIA accelerators. He has a strong background in machine learning, deep learning, NLP, and LLMs. He also helps partners make the most of NVIDIA's technologies for their AI workloads. He holds an MSc degree in Computer Science from RWTH Aachen University.

Posts by Ruixiang Wang

Edge Computing Jun 09, 2026

Model Quantization: Turn FP8 Checkpoints into High-Performance Inference Engines with NVIDIA TensorRT

This post is the third of a three-part series. See also Model Quantization: Concepts, Methods, and Why It Matters and Model Quantization: Post-Training... 10 MIN READ

Edge Computing May 07, 2026

Model Quantization: Post-Training Quantization Using NVIDIA Model Optimizer

This post is the second of a three-part series. See also Model Quantization: Concepts, Methods, and Why It Matters and Model Quantization: Turn FP8 Checkpoints... 8 MIN READ

Edge Computing Nov 24, 2025

Model Quantization: Concepts, Methods, and Why It Matters

This post is the first of a three-part series. See also Model Quantization: Post-Training Quantization Using NVIDIA Model Optimizer and Model Quantization: Turn... 12 MIN READ